
Databricks Jobs API - Throttling

noorbasha534
Contributor III

Dear all,

I am planning to run a script that fetches the status of Databricks jobs every 10 minutes. I have around 500 jobs in my workspace. The API I use is listed below: list runs (to get all job runs in a date range).

I was wondering whether this could cause throttling, since there are rate limits on the Jobs APIs. Are there better ways to handle this use case, apart from introducing logic to handle throttling?

On a side note, if throttling occurs, will other important jobs in the workspace fail (say, fail to launch), or will they simply be retried once the throttling clears?

# Assumes `db` is a client whose jobs.list_runs returns a dict,
# e.g. the databricks-api package's DatabricksAPI wrapper:
# from databricks_api import DatabricksAPI
# db = DatabricksAPI(host="<workspace-url>", token="<personal-access-token>")

# Function to get all job runs within the date range
def get_all_job_runs(start_time, end_time):
    all_runs = []
    has_more = True
    offset = 0
    limit = 25  # Adjust the limit as needed

    while has_more:
        job_runs = db.jobs.list_runs(
            active_only=False,
            start_time_from=start_time,
            start_time_to=end_time,
            offset=offset,
            limit=limit
        )
        all_runs.extend(job_runs.get('runs', []))  # 'runs' is absent when a page is empty
        has_more = job_runs.get('has_more', False)
        offset += limit

    return all_runs

# Get all job runs for the given date range
job_runs = get_all_job_runs(start_time, end_time)
1 REPLY

koji_kawamura
Databricks Employee

Hi @noorbasha534 

Different rate limits apply to different API endpoints. The "/jobs/runs/list" endpoint is limited to 30 requests per second, while the number of concurrent task executions is limited to 2000. These limits work separately, so hitting the jobs list API rate limit can produce a 429 response, but it should not block the launch of new jobs.

https://docs.databricks.com/aws/en/resources/limits#api-rate-limits

If you have about 500 jobs and fetch 25 runs per request, your script makes about 20 API calls per polling cycle. Even if all of those calls land within a single second, that is below the limit, but if you have more jobs in the future, it may hit the limit.
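If you do want to guard against 429s, a common hedge is to retry with exponential backoff. Below is a minimal sketch calling the REST endpoint directly with requests; the fetch_page function, HOST, and TOKEN are illustrative placeholders, not code from your post:

import time
import requests

HOST = "https://<workspace-url>"
TOKEN = "<personal-access-token>"

def fetch_page(offset, limit, start_time, end_time, max_retries=5):
    """Fetch one page of job runs, retrying on HTTP 429 with exponential backoff."""
    url = f"{HOST}/api/2.1/jobs/runs/list"
    params = {
        "offset": offset,
        "limit": limit,
        "start_time_from": start_time,
        "start_time_to": end_time,
    }
    delay = 1.0
    for attempt in range(max_retries):
        resp = requests.get(
            url,
            headers={"Authorization": f"Bearer {TOKEN}"},
            params=params,
        )
        if resp.status_code == 429:
            # Honor Retry-After if the server sends it; otherwise back off exponentially.
            wait = float(resp.headers.get("Retry-After", delay))
            time.sleep(wait)
            delay *= 2
            continue
        resp.raise_for_status()
        return resp.json()
    raise RuntimeError("Still rate limited after retries")

Newer versions of the databricks-sdk also retry 429s for you, so if you use the SDK client you may get this behavior without extra code.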

Alternatively, depending on your requirements, system tables may be helpful. For example, you can query many job runs at once with the following SQL statement:

SELECT * FROM system.lakeflow.job_run_timeline
WHERE workspace_id = "<workspace-id>"
AND period_start_time >= "2025-03-15T09:00:00"
AND period_end_time <= "2025-03-15T10:00:00"
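If you go the system-tables route, the query can also run from your script. Here is a minimal sketch using the databricks-sdk Statement Execution API, assuming you have a running SQL warehouse; the warehouse ID and time range are placeholders:

from databricks.sdk import WorkspaceClient

# Reads host/token from the environment or a config profile.
w = WorkspaceClient()

query = """
SELECT * FROM system.lakeflow.job_run_timeline
WHERE period_start_time >= "2025-03-15T09:00:00"
  AND period_end_time <= "2025-03-15T10:00:00"
"""

resp = w.statement_execution.execute_statement(
    statement=query,
    warehouse_id="<warehouse-id>",
    wait_timeout="30s",
)
rows = resp.result.data_array if resp.result else []
print(f"Fetched {len(rows)} job-run rows in a single request")

A single query like this replaces the ~20 paginated Jobs API calls per polling cycle, so the API rate limit stops being a concern.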