Dear all,
I am planning to run a script that fetches Databricks job-run statuses every 10 minutes. I have around 500 jobs in my workspace. The script calls the Jobs API's list-runs endpoint via the helper function get_all_job_runs shown below.
I was wondering whether this could trigger throttling, since the Jobs API has rate limits. Are there better ways to handle this use case, apart from adding logic to handle throttled requests?
On a side note, if throttling occurs, will other important jobs in the workspace fail (say, fail to launch), or will they simply be retried once the throttling clears?
# Function to get all job runs within the date range,
# paginating through results with offset/limit
def get_all_job_runs(start_time, end_time):
    all_runs = []
    has_more = True
    offset = 0
    limit = 25  # Page size; adjust as needed
    while has_more:
        job_runs = db.jobs.list_runs(
            active_only=False,
            start_time_from=start_time,
            start_time_to=end_time,
            offset=offset,
            limit=limit
        )
        # 'runs' may be absent on an empty page
        all_runs.extend(job_runs.get('runs', []))
        has_more = job_runs.get('has_more', False)
        offset += limit
    return all_runs

# Get all job runs for the given date range
job_runs = get_all_job_runs(start_time, end_time)
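In case it helps the discussion, here is a minimal sketch of the kind of client-side retry logic I had in mind: exponential backoff with jitter around each API call. The RateLimitError class and with_backoff helper are illustrative assumptions, not part of the Databricks SDK; in practice you would catch whatever exception your client raises for an HTTP 429 response.

```python
import random
import time


class RateLimitError(Exception):
    """Stand-in for an HTTP 429 (rate limit) error from the Jobs API."""


def with_backoff(fn, max_retries=5, base_delay=1.0):
    """Call fn(); on a rate-limit error, sleep and retry.

    The delay doubles on each attempt (1s, 2s, 4s, ...) plus a random
    jitter so that concurrent pollers do not retry in lockstep.
    """
    for attempt in range(max_retries):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # give up after the final attempt
            delay = base_delay * (2 ** attempt) + random.uniform(0, 1)
            time.sleep(delay)
```

Each list_runs call in the pagination loop could then be wrapped as, for example, with_backoff(lambda: db.jobs.list_runs(...)), so a throttled page is retried without restarting the whole scan.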