Administration & Architecture
Explore discussions on Databricks administration, deployment strategies, and architectural best practices. Connect with administrators and architects to optimize your Databricks environment for performance, scalability, and security.

Any recommended way for a different app to start their dependent job based on Databricks job?

PradeepPrabha
New Contributor III

How can we configure a job in a different Azure application to be triggered after the completion of an Azure Databricks job? Once the Databricks job is successful, the job in the third-party application hosted in Azure should start. I attempted to use the default webhook notification available in Databricks, which performs an HTTP POST, but I couldn't find useful information regarding the job in the RequestBody of the WebhookData parameter. Do you have any suggestions?

2 ACCEPTED SOLUTIONS


Louis_Frolio
Databricks Employee

Greetings @PradeepPrabha , I did some digging and here is what I found.

Short answer: Use Databricks job notifications with an HTTP webhook that points to a lightweight receiver in Azure (for example, an Azure Function or a Logic App). The webhook payload includes workspace_id, job_id, and run_id. Your receiver uses run_id to call the Databricks Jobs API, fetch full run details, and then trigger the downstream job. Make sure the receiver returns a 2xx response quickly to avoid retries and duplicate events.

Recommended pattern (event-driven)

  1. Configure a system destination (one-time, admin only)

    In Admin Settings → Notifications → Add destination, choose Webhook (or Slack, Teams, PagerDuty if that fits your use case). For webhooks, you can configure basic auth. Databricks enforces HTTPS and requires certificates from a trusted CA.

  2. Attach the destination to your job

    Open the job, go to Job notifications, and add notifications for Start, Success, and/or Failure. Select the destination you created. You can configure up to three destinations per event.

  3. Build a simple receiver in Azure (Function or Logic App)

    Parse the incoming JSON payload to extract workspace_id, job_id, and run_id.

    Use run_id to call Jobs Runs Get (and optionally Runs Get Output) to retrieve status, task-level details, and error messages.

    Trigger your third-party job using its API.

Return HTTP 2xx within roughly five seconds and offload any heavier work asynchronously. If you don't, Databricks will retry the notification and you'll often see two or three duplicates on failures. Add idempotency logic keyed on run_id to safely dedupe.
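If you manage jobs through the Jobs API rather than the UI, step 2 can also be expressed in the job settings via a webhook_notifications block. A minimal sketch, assuming Jobs API 2.1; the destination IDs are placeholders for the IDs of destinations you created in step 1:

```json
{
  "webhook_notifications": {
    "on_start":   [ { "id": "<destination-id>" } ],
    "on_success": [ { "id": "<destination-id>" } ],
    "on_failure": [ { "id": "<destination-id>" } ]
  }
}
```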

Example payload (start, success, or failure):

{
  "event_type": "jobs.on_success",
  "workspace_id": "your_workspace_id",
  "run": { "run_id": "12345" },
  "job": { "job_id": "67890", "name": "job_name" }
}
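The receiver logic for that payload can be sketched framework-agnostically as below. The function name and the in-memory dedupe set are illustrative; in an Azure Function you would adapt this to the HTTP-trigger signature and back the dedupe with durable storage:

```python
import json

# Run IDs already handled, for idempotency. This in-memory set is
# per-process; in production use durable storage (e.g. a table or queue),
# since Databricks may retry and deliver the same event more than once.
_seen_runs = set()

def handle_notification(body: bytes) -> int:
    """Parse a Databricks job-notification payload, return an HTTP status.

    Returns quickly so Databricks does not retry the webhook; any heavy
    work (Jobs API lookups, triggering the downstream system) should be
    queued for a background worker instead of done inline.
    """
    try:
        payload = json.loads(body)
        run_id = str(payload["run"]["run_id"])
        job_id = str(payload["job"]["job_id"])
        event = payload.get("event_type", "")
    except (ValueError, KeyError):
        return 400  # malformed or unexpected payload

    if run_id in _seen_runs:
        return 200  # duplicate delivery: acknowledge and do nothing
    _seen_runs.add(run_id)

    if event == "jobs.on_success":
        # Hand off (job_id, run_id) to the worker that calls the Jobs API
        # for full details and then starts the third-party job.
        pass

    return 200
```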

Tip

If you need richer context than what the webhook provides, the supported approach today is to "fan out" using run_id and job_id via the Jobs API. Fully custom webhook payloads aren't supported for job notifications, so enrichment via API calls is the intended pattern.
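That fan-out is a single GET against the Jobs API 2.1 Runs Get endpoint. A stdlib sketch; the workspace URL and token are placeholders you would supply from your own configuration:

```python
import json
import urllib.request

def build_runs_get_request(host: str, token: str, run_id: str) -> urllib.request.Request:
    """Build the Jobs API 2.1 'Runs Get' request for one run.

    host is your workspace URL (e.g. https://adb-<id>.azuredatabricks.net,
    a placeholder here); token is a PAT or Entra ID access token.
    """
    url = f"{host.rstrip('/')}/api/2.1/jobs/runs/get?run_id={run_id}"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})

def get_run_details(host: str, token: str, run_id: str) -> dict:
    """Fetch full run details: state, task-level results, timings, errors."""
    with urllib.request.urlopen(build_runs_get_request(host, token, run_id)) as resp:
        return json.load(resp)
```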

Common gotchas

Networking and allowlisting: If you restrict inbound traffic, make sure Databricks control plane IPs used for notifications are allowlisted. Several "silent delivery failure" cases boil down to this.

Slack or Teams formatting: Don't build logic that depends on message structure. If you need a stable schema, use a generic webhook and enrich the payload yourself via the Jobs API.

Alternatives

Call the third-party API directly from the Databricks job on success, for example as a final notebook task using requests. This is often the simplest approach if you control the job code.
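Such a final task can be sketched with the stdlib so nothing extra needs installing (the answer mentions requests, which works the same way). The endpoint URL and payload fields are placeholders for whatever your third-party application's trigger API expects:

```python
import json
import urllib.request

def build_trigger_request(endpoint: str, job_name: str, run_id: str) -> urllib.request.Request:
    """Build the POST that starts the downstream job.

    endpoint is your application's trigger URL (placeholder); the body
    shape is illustrative - match it to the downstream API's contract.
    """
    body = json.dumps({"source_job": job_name, "run_id": run_id}).encode()
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# In the notebook's final task you would then send it:
# with urllib.request.urlopen(build_trigger_request(...)) as resp:
#     assert 200 <= resp.status < 300
```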

Use an external orchestrator for cross-system dependencies. Azure Data Factory can run Databricks jobs and then continue downstream in the same pipeline. Apache Airflow has first-class Databricks operators and can schedule follow-on work once a Databricks run completes.

 

Why the webhook felt โ€œemptyโ€

This is by design. Databricks keeps the job notification payload intentionally minimal: event_type, workspace_id, job_id, and run_id. The expectation is that you use those identifiers to query the Jobs API for full context (tasks, timings, errors) and then dispatch whatever downstream action you need.

 

Hope this helps, Louis.


PradeepPrabha
New Contributor III

Thank you.

 

 

Thank you for the detailed answer!

I have tested both the Azure Function approach and an Azure runbook. Both work fine.

I also tested adding the HTTP POST to an Azure Function as a final job task, gated on an "if all other notebooks succeeded" condition.

