Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

TaskSensor - check if a task succeeded

kmodelew
New Contributor II

Hi,

I would like to check whether a task within a job has succeeded (even if the job itself is marked as failed because one of its other tasks failed).

I need to create dependencies on tasks that live in other jobs. The case is that I have one job that loads all tables for one country, and a report uses some of the tables produced by that job.

I'm looking for something similar to ExternalTaskSensor in Airflow.

What is the best practice in this area? I see the option to use the API, but maybe you see other, better possibilities.

1 ACCEPTED SOLUTION

Accepted Solutions

Kaniz_Fatma
Community Manager

Hi @kmodelew

  • Databricks Jobs now supports task orchestration, allowing you to run multiple tasks as a directed acyclic graph (DAG). This simplifies the creation, management, and monitoring of your data and machine learning workflows.
  • You can orchestrate tasks using the Databricks UI and API, which makes the feature accessible to data scientists and analysts, and it is available at no additional cost.
  • By default, a task runs only after all of the tasks it depends on have succeeded. However, you can configure tasks to run conditionally based on specific criteria.
  • For example, you can set up an "If/else condition" task that runs downstream tasks only when certain conditions are met, giving you more fine-grained control over task execution.
  • Databricks doesn't have an exact equivalent to Airflow's ExternalTaskSensor. In Airflow, the ExternalTaskSensor waits for a task in a different DAG to complete before proceeding, so if your Databricks jobs are orchestrated from Airflow you can still use it to create dependencies across different jobs.
  • It's particularly useful when you have interdependent tasks across different workflows.
  • If you prefer programmatic control, you can use the Databricks APIs to query the status of tasks or jobs.
  • For example, you can use the Jobs API to check the status of specific tasks within a job, as sketched below.
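A minimal sketch of that last point, assuming a personal access token and the Jobs API 2.1 `jobs/runs/get` endpoint; the host, token, run_id, and task_key values are placeholders you would replace with your own:

```python
import requests

# Placeholders - replace with your workspace URL and personal access token.
HOST = "https://<your-workspace>.cloud.databricks.com"
TOKEN = "<personal-access-token>"

def task_succeeded(run_id: int, task_key: str) -> bool:
    """Return True if the given task in the given job run finished with result state SUCCESS."""
    resp = requests.get(
        f"{HOST}/api/2.1/jobs/runs/get",
        headers={"Authorization": f"Bearer {TOKEN}"},
        params={"run_id": run_id},
        timeout=30,
    )
    resp.raise_for_status()
    for task in resp.json().get("tasks", []):
        if task["task_key"] == task_key:
            return task.get("state", {}).get("result_state") == "SUCCESS"
    return False  # task_key not found in this run
```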


2 REPLIES


kmodelew
New Contributor II

@Kaniz_Fatma

Thank you. In my case I plan to use the API:

https://docs.databricks.com/api/workspace/jobs/listruns - to find the latest run_id

https://docs.databricks.com/api/workspace/jobs/getrun - to get the task statuses for that run_id

A Python script will run every 10 minutes during a predefined time window and check the status of the tasks, roughly as sketched below.
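A rough sketch of that polling script, using the two endpoints linked above. The workspace URL, token, job ID, task key, and time window are placeholders, and it assumes `jobs/runs/list` returns runs newest-first and that `jobs/runs/get` includes per-task states:

```python
import time
import requests

HOST = "https://<your-workspace>.cloud.databricks.com"  # placeholder workspace URL
TOKEN = "<personal-access-token>"                        # placeholder PAT
HEADERS = {"Authorization": f"Bearer {TOKEN}"}

UPSTREAM_JOB_ID = 123456789        # placeholder: job that loads the country's tables
TASK_KEY = "load_country_tables"   # placeholder: task the report depends on
POLL_SECONDS = 600                 # check every 10 minutes
MAX_WAIT_SECONDS = 6 * 3600        # stop after the predefined time window

def latest_run_id(job_id: int) -> int | None:
    """Return the run_id of the most recent run of the job (jobs/runs/list)."""
    resp = requests.get(
        f"{HOST}/api/2.1/jobs/runs/list",
        headers=HEADERS,
        params={"job_id": job_id, "limit": 1},
        timeout=30,
    )
    resp.raise_for_status()
    runs = resp.json().get("runs", [])
    return runs[0]["run_id"] if runs else None

def task_result_state(run_id: int, task_key: str) -> str | None:
    """Return the result_state of one task within a run (jobs/runs/get)."""
    resp = requests.get(
        f"{HOST}/api/2.1/jobs/runs/get",
        headers=HEADERS,
        params={"run_id": run_id},
        timeout=30,
    )
    resp.raise_for_status()
    for task in resp.json().get("tasks", []):
        if task["task_key"] == task_key:
            return task.get("state", {}).get("result_state")
    return None  # task not found or not yet finished

deadline = time.time() + MAX_WAIT_SECONDS
while time.time() < deadline:
    run_id = latest_run_id(UPSTREAM_JOB_ID)
    state = task_result_state(run_id, TASK_KEY) if run_id else None
    print(f"run_id={run_id} task={TASK_KEY} result_state={state}")
    if state == "SUCCESS":
        break  # upstream task finished - trigger the dependent report here
    time.sleep(POLL_SECONDS)
```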
