Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Databricks workflows for APIs with different frequencies (cluster keeps restarting)

mordex
New Contributor III
 
 

Hey everyone,

I’m stuck with a Databricks workflow design and could use some advice.

Currently, we are calling 70+ APIs.

Right now the workflow looks something like:
Task1 → Task2 → ForEach → notebook (API calls)

However, there is a new requirement that each API needs to be called at a different frequency: some must run every minute, some every 2 minutes, some every 5. And we have to create a generalized solution.

In Task 1 we read a view that stores every API, its path, and its call frequency (apicallfreq).

We’re using job clusters, and the problem is:

  • Cluster spins up
  • Runs the job
  • Terminates immediately
  • Next run starts → spins up again

So for 1-min jobs, it’s basically constantly restarting clusters, which is not really feasible (time + cost).

We looked into:

  • Continuous jobs → but that doesn’t really work for us because we need task dependencies + ForEach
  • Cron scheduling → same issue, cluster keeps terminating after each run

Has anyone handled something similar?

  • Did you move everything into a single notebook and manage scheduling inside?
  • Use an all-purpose cluster instead?
  • Or is there a better pattern for handling different API frequencies?

Would really appreciate any practical suggestions from real-world setups.

Thanks!

 
 
 
 
1 REPLY

lingareddy_Alva
Esteemed Contributor

Hi @mordex 

This is a classic high-frequency orchestration problem on Databricks. The core issue is that Databricks job clusters are designed for batch workloads, not sub-5-minute polling loops. 

Job clusters have a ~3–5 min cold start. For a 1-min frequency API, you're spending more time starting the cluster than running the actual work. This is fundamentally the wrong tool for that cadence.

Option 1: All-Purpose Cluster with Internal Scheduling (Most Practical)

Keep your existing workflow structure but run it on an all-purpose cluster that stays alive. Inside the notebook, manage the frequency loop yourself.
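A minimal sketch of that internal loop, assuming the config from the Task 1 view has been collected into a Python list with fields api_name, path, and apicallfreq (in minutes). All names and values here are hypothetical placeholders, not your actual view schema:

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical rows mirroring the Task 1 view:
# (api_name, path, apicallfreq in minutes).
API_CONFIG = [
    {"api_name": "orders",    "path": "/v1/orders",    "apicallfreq": 1},
    {"api_name": "customers", "path": "/v1/customers", "apicallfreq": 2},
    {"api_name": "inventory", "path": "/v1/inventory", "apicallfreq": 5},
]

def due_apis(config, tick):
    """Return the APIs whose frequency (in minutes) divides the current minute tick."""
    return [row for row in config if tick % row["apicallfreq"] == 0]

def call_api(row):
    # Placeholder for the real API call currently done in the ForEach notebook.
    print(f"calling {row['api_name']} at {row['path']}")

def run_forever(config, poll_seconds=60):
    """Wake once a minute, dispatch whatever is due, never let the cluster go idle-terminate."""
    tick = 0
    with ThreadPoolExecutor(max_workers=8) as pool:
        while True:
            tick += 1
            for row in due_apis(config, tick):
                pool.submit(call_api, row)  # don't block the scheduler loop on slow APIs
            time.sleep(poll_seconds)

# run_forever(API_CONFIG)  # uncomment when running on an all-purpose cluster
```

The scheduling decision (`due_apis`) is deliberately a pure function, so each 1/2/5-minute cadence is just a modulo check against a shared minute counter rather than a separate job schedule.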

Option 2: Databricks Jobs + Delta Queue Pattern (Most Robust)

The pieces you need:

  • A dispatcher job that writes due API calls into a Delta table queue
  • A long-running worker job that polls the queue and executes calls
  • The queue table DDL
  • The dispatcher logic with the SQL to identify due APIs

Option 3: Structured Streaming with foreachBatch
If your APIs can be modeled as a micro-batch stream, Structured Streaming on an all-purpose cluster handles variable-frequency polling natively.

Given you already have a view-based config and ForEach pattern, Option 2 (Delta Queue) is the cleanest long-term solution. But if you need something working fast, Option 1 (all-purpose cluster + threading) gets you there today with minimal refactoring.

The key insight is: separate the scheduling decision from the execution. Your view already stores the frequency — you just need a lightweight process that reads it and dispatches work, rather than having the cluster itself restart to make that decision.

 

LR