Databricks Jobs API - Throttling
03-11-2025 04:23 PM - edited 03-11-2025 04:25 PM
Dear all,
I am planning to run a script that fetches the status of my Databricks jobs every 10 minutes. I have around 500 jobs in my workspace. The APIs I use are the list runs and get job run endpoints.
I was wondering whether this could cause throttling, since there are rate limits on the Jobs APIs. Are there better ways to handle this use case, apart from adding logic to handle throttling?
On a side note, if throttling occurs, will other important jobs in the workspace fail (say, fail to launch), or will they simply be retried once the throttling clears?
Labels: Delta Lake, Spark, Workflows
03-17-2025 01:17 AM
Different rate limits apply to different API endpoints. The /jobs/runs/list endpoint is limited to 30 requests per second, and the number of concurrent task runs is limited to 2000 per workspace. These limits are enforced separately, so the runs list API may return a 429 response without blocking the launch of new jobs.
https://docs.databricks.com/aws/en/resources/limits#api-rate-limits
If you have about 500 jobs and poll each one back-to-back, your script may call the API endpoint roughly 20 times per second. That is below the limit, but if you add more jobs in the future, you may start hitting it.
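If you stay on the REST API, it also helps to fetch recent runs in one paged sweep (using the start_time_from filter of /api/2.1/jobs/runs/list) rather than issuing one request per job, and to back off when a 429 arrives. Here is a minimal Python sketch; the DATABRICKS_HOST and DATABRICKS_TOKEN environment variables and the 10-minute lookback window are assumptions for illustration:

import os
import time
import requests

HOST = os.environ["DATABRICKS_HOST"]    # e.g. https://<workspace>.cloud.databricks.com (assumed env var)
TOKEN = os.environ["DATABRICKS_TOKEN"]  # personal access token (assumed auth method)

def list_recent_runs(start_time_from_ms):
    """Page through /api/2.1/jobs/runs/list, backing off on HTTP 429."""
    runs = []
    page_token = None
    session = requests.Session()
    session.headers["Authorization"] = f"Bearer {TOKEN}"
    while True:
        params = {"limit": 25, "start_time_from": start_time_from_ms}
        if page_token:
            params["page_token"] = page_token
        resp = session.get(f"{HOST}/api/2.1/jobs/runs/list", params=params)
        if resp.status_code == 429:
            # Honor Retry-After when the server sends it; otherwise wait a few seconds.
            time.sleep(int(resp.headers.get("Retry-After", "5")))
            continue
        resp.raise_for_status()
        body = resp.json()
        runs.extend(body.get("runs", []))
        if not body.get("has_more"):
            return runs
        page_token = body.get("next_page_token")

if __name__ == "__main__":
    # One paged sweep over runs started in the last 10 minutes,
    # instead of one request per job.
    since_ms = int((time.time() - 600) * 1000)
    for run in list_recent_runs(since_ms):
        print(run["run_id"], run.get("state", {}).get("life_cycle_state"))

With this pattern the request count scales with the number of recent runs rather than the number of jobs, which keeps you well under the 30 requests/second limit.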
Alternatively, depending on your requirements, system tables may be helpful. For example, you can query many job runs at once with a SQL statement like the following:
SELECT *
FROM system.lakeflow.job_run_timeline
WHERE workspace_id = '<workspace-id>'
  AND period_start_time >= '2025-03-15T09:00:00'
  AND period_end_time <= '2025-03-15T10:00:00'
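Note that system tables must be enabled on the workspace, and that their data is ingested with some delay, so they are better suited to monitoring and reporting than to near-real-time status checks.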

