04-01-2026 01:27 PM
Title: Databricks workflows for APIs with different frequencies (cluster keeps restarting)
Hey everyone,
I’m stuck with a Databricks workflow design and could use some advice.
Currently, we are calling 70+ APIs.
Right now the workflow looks something like:
Task1 → Task2 → ForEach → notebook (API calls)
However, there is a new requirement: each API needs to be called at a different frequency. Some must run every 1 minute, some every 2 minutes, some every 5 minutes, and we have to build a generalized solution.
In Task1 we read a view that stores every API, its path, and its apicallfreq.
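To make the requirement concrete, here is a minimal sketch of how the rows from that view could be filtered on each run. It assumes apicallfreq is a whole number of minutes and that each row is available as a dict; the `api` key, the `due_apis` name, and the `minute_index` counter are hypothetical and only there for illustration.

```python
import time


def due_apis(rows, minute_index):
    """Return the rows whose interval divides the current minute counter.

    Assumption: row["apicallfreq"] is an interval in whole minutes.
    """
    return [r for r in rows if minute_index % int(r["apicallfreq"]) == 0]


# Hypothetical rows standing in for what Task1 reads from the view.
example_rows = [
    {"api": "a", "apicallfreq": 1},
    {"api": "b", "apicallfreq": 2},
    {"api": "c", "apicallfreq": 5},
]

# Minutes since the Unix epoch, so every run agrees on the same counter.
current_minute = int(time.time() // 60)
to_call_now = due_apis(example_rows, current_minute)
```

With this, a job triggered every minute would pass only the due rows to the ForEach, so the 2-min and 5-min APIs are skipped on the runs where they are not due.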
We’re using job clusters, and the problem is:
- Cluster spins up
- Runs the job
- Terminates immediately
- Next run starts → spins up again
So for 1-min jobs, it’s basically constantly restarting clusters, which is not really feasible (time + cost).
We looked into:
- Continuous jobs → but that doesn’t really work for us because we need task dependencies + ForEach
- Cron scheduling → same issue, cluster keeps terminating after each run
Has anyone handled something similar?
- Did you move everything into a single notebook and manage scheduling inside?
- Use an all-purpose cluster instead?
- Or is there a better pattern for handling different API frequencies?
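For the first option, this is roughly what "manage scheduling inside" could look like: a single long-running loop that keeps the next due time for each API in a heap. It is only a sketch under assumptions: apicallfreq is taken to be in minutes, and `run_scheduler`, `call_api`, and `stop_after_s` are hypothetical names, not anything Databricks provides.

```python
import heapq
import time


def run_scheduler(rows, call_api, now=time.monotonic, sleep=time.sleep,
                  stop_after_s=None):
    """Call each API at its own interval inside one long-running process.

    Assumption: row["apicallfreq"] is an interval in whole minutes.
    call_api is a hypothetical callable that performs one API call.
    """
    start = now()
    # Heap entries are (next_due_time, row_index, row); the unique index
    # breaks ties so the row dicts themselves are never compared.
    heap = [(start, i, row) for i, row in enumerate(rows)]
    heapq.heapify(heap)

    while heap:
        due, i, row = heap[0]
        # Optional stop condition so the loop can be bounded.
        if stop_after_s is not None and due - start >= stop_after_s:
            break
        wait = due - now()
        if wait > 0:
            sleep(wait)
        heapq.heappop(heap)
        call_api(row)
        # Schedule from the planned time, not the finish time, to avoid drift.
        heapq.heappush(heap, (due + 60 * int(row["apicallfreq"]), i, row))
```

The cluster would stay up for as long as the loop runs, which avoids the restart per run but means this loop calls the APIs one after another, so a slow call delays the ones queued behind it.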
Would really appreciate any practical suggestions from real-world setups.
Thanks!