MoJaMa
Databricks Employee

I don't think there is anything native for this in Databricks. The closest match would be the system tables (system.lakeflow.job_run_timeline / job_task_run_timeline), but I don't think they have the grain your pattern needs.

There are probably two ways to approach it.

Approach 1:

  • Enable Change Data Feed on your status Delta table: ALTER TABLE <your_status_table> SET TBLPROPERTIES (delta.enableChangeDataFeed = true).
  • Create a Lakebase Postgres instance and a synced table in Continuous mode, which replicates the Delta table to Postgres with a minimum refresh interval of ~15 seconds.
  • Your backend API queries Postgres directly with a plain aggregate (SELECT COUNT(*) FILTER (WHERE status = 'COMPLETED'), COUNT(*) FROM tasks WHERE run_id = ?). You get millisecond latency, no Databricks SQL warehouse spin-up cost per request, and no progress logic in the processing job. Lakebase supports up to 1,000 concurrent connections, so you can poll from the frontend safely.
  • Bonus: the same Lakebase instance can back other operational lookups for your app.
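
Here is a minimal sketch of what the backend query in Approach 1 could look like. It assumes the synced Postgres table is called tasks with run_id and status columns (as in the query above) and that conn is any open DB-API 2.0 connection using %s placeholders, for example one from psycopg. The function name and the returned dictionary shape are just illustrative, not anything Lakebase prescribes.

```python
from typing import Any, Dict

# Same aggregate as in the bullet above, with %s as the DB-API placeholder
# (psycopg style). Table and column names are assumptions.
PROGRESS_SQL = (
    "SELECT COUNT(*) FILTER (WHERE status = 'COMPLETED'), COUNT(*) "
    "FROM tasks WHERE run_id = %s"
)


def fetch_progress(conn: Any, run_id: str) -> Dict[str, float]:
    """Return completed/total counts for one run from the synced table.

    `conn` is assumed to be an open DB-API connection to the Postgres instance.
    """
    cur = conn.cursor()
    try:
        cur.execute(PROGRESS_SQL, (run_id,))
        completed, total = cur.fetchone()
    finally:
        cur.close()
    # Guard against a run whose task rows have not been replicated yet.
    percent = (100.0 * completed / total) if total else 0.0
    return {"completed": completed, "total": total, "percent": percent}
```

Your API handler would call fetch_progress(conn, run_id) on each poll and return the dictionary as JSON; with a ~15 second sync interval the numbers can lag the Delta table by about that much.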

Approach 2:

  • Point your backend at a small serverless SQL warehouse and call the Statement Execution API (https://docs.databricks.com/api/workspace/statementexecution) with a parameterized aggregate query keyed by run_id.
  • Cache results in your backend for a few seconds to avoid hammering the warehouse.
  • Trade-off: serverless warehouse cold-starts and per-query latency are higher than Postgres; fine for a progress bar polled every 3–10s, less ideal if you need sub-second updates.
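
A rough sketch of Approach 2, using only the Python standard library. It posts a parameterized statement to /api/2.0/sql/statements and puts a small in-memory cache in front of it. The host, token, warehouse ID, table name (tasks), the STRING type for run_id, and the helper names are all assumptions for illustration; check the request and response fields against the API docs linked above before relying on them.

```python
import json
import time
import urllib.request
from typing import Callable, Dict, Tuple

# Named parameter marker (:run_id) keeps the query parameterized.
PROGRESS_SQL = (
    "SELECT COUNT(*) FILTER (WHERE status = 'COMPLETED'), COUNT(*) "
    "FROM tasks WHERE run_id = :run_id"
)


def build_payload(warehouse_id: str, run_id: str) -> dict:
    """Request body for POST /api/2.0/sql/statements."""
    return {
        "warehouse_id": warehouse_id,
        "statement": PROGRESS_SQL,
        "parameters": [{"name": "run_id", "value": run_id, "type": "STRING"}],
        "wait_timeout": "10s",  # wait synchronously for up to 10 seconds
    }


def parse_counts(response: dict) -> Tuple[int, int]:
    """Extract (completed, total) from an inline result."""
    state = response["status"]["state"]
    if state != "SUCCEEDED":
        # Still PENDING/RUNNING (e.g. cold start) or failed; caller can retry.
        raise RuntimeError(f"statement not finished: {state}")
    completed, total = response["result"]["data_array"][0]
    return int(completed), int(total)  # values arrive as strings


def query_warehouse(host: str, token: str, warehouse_id: str, run_id: str) -> Tuple[int, int]:
    """Run the aggregate on the SQL warehouse and return (completed, total)."""
    req = urllib.request.Request(
        f"{host}/api/2.0/sql/statements",
        data=json.dumps(build_payload(warehouse_id, run_id)).encode(),
        headers={"Authorization": f"Bearer {token}", "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return parse_counts(json.load(resp))


class TTLCache:
    """Serve a cached value per run_id until it is older than ttl_seconds."""

    def __init__(self, fetch: Callable[[str], Tuple[int, int]],
                 ttl_seconds: float = 5.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Tuple[int, int]]] = {}

    def get(self, run_id: str) -> Tuple[int, int]:
        now = self._clock()
        hit = self._entries.get(run_id)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # fresh enough, skip the warehouse
        value = self._fetch(run_id)
        self._entries[run_id] = (now, value)
        return value
```

You would wire it up with something like TTLCache(lambda r: query_warehouse(host, token, warehouse_id, r), ttl_seconds=5.0), so that many frontend polls within the same few seconds turn into a single warehouse query per run.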

You can continue using your existing job success/fail webhook for the terminal signal, and use the approaches above only for the in-flight progress updates (e.g. 200/500). That avoids hammering anything when the run is idle.

~Mohan Mathews, Lead DSA.
