topic Re: get job run link based on the job name or the submit body in Data Engineering

get job run link based on the job name or the submit body

ctiwari7 — Tue, 10 Sep 2024 11:28:32 GMT

This is the code I am currently using (ignore the indentation). It lists all running jobs and then filters that list to find the run ID of the run whose name matches. Is there a more efficient way to do this?

I am using the legacy Databricks CLI, version 0.17.8.

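import json
import subprocess

# List runs with the legacy Databricks CLI and parse the JSON output.
cmd = ["databricks", "runs", "list", "--output", "json"]
output = subprocess.run(cmd, capture_output=True)  # noqa: S607,S603
stdout = output.stdout.decode("utf-8")
runs = json.loads(stdout)

run_name = submit_body["run_name"]
spark_python_task = submit_body["spark_python_task"]

# Pick the first run whose name and Python task both match the submit body.
matching_run = None
for _run in runs.get("runs", []):
    # Runs of other task types have no "spark_python_task" key, so use .get().
    _task = _run.get("task", {}).get("spark_python_task")
    if _run.get("run_name") == run_name and _task == spark_python_task:
        matching_run = _run
        break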
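Since the goal is the run link, the same filter can be wrapped in a function that returns the run's page URL. This is only a minimal sketch: it assumes the CLI's JSON output mirrors the Jobs API runs list response, where each run carries a `run_page_url` field, and `find_run_url` is a hypothetical helper name, not part of the CLI.

```python
from typing import Optional


def find_run_url(runs: dict, submit_body: dict) -> Optional[str]:
    """Return the page URL of the first run matching the submit body, or None."""
    wanted_name = submit_body["run_name"]
    wanted_task = submit_body["spark_python_task"]
    for run in runs.get("runs", []):
        # Runs of other task types have no spark_python_task entry.
        task = run.get("task", {}).get("spark_python_task")
        if run.get("run_name") == wanted_name and task == wanted_task:
            # Assumption: run_page_url is present, as in the Jobs API response.
            return run.get("run_page_url")
    return None
```

With the variables from the snippet above, `find_run_url(runs, submit_body)` would give the link directly, or `None` when nothing matches.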

Re: get job run link based on the job name or the submit body

szymon_dybczak — Tue, 10 Sep 2024 12:03:29 GMT

Hi @ctiwari7 ,

I don't know whether this counts as better, because that is quite subjective, but you can try two alternative approaches:

1. System tables: Jobs system table reference | Databricks on AWS

2. REST API calls, in two steps:

- First, get a list of all job names and their respective IDs using the list jobs REST API endpoint:

List jobs | Jobs API | REST API reference | Azure Databricks

- Then use the list job runs endpoint to get the active job runs with all the required information. You can associate each job run with its job name using the job_id attribute:

List job runs | Jobs API | REST API reference | Azure Databricks
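The two REST calls could be sketched roughly as follows. This is an illustration under assumptions, not a tested integration: it targets the Jobs API 2.1 paths `/api/2.1/jobs/list` and `/api/2.1/jobs/runs/list`, reads only the first page of each response (pagination is omitted), and `make_get` and `active_runs_with_names` are hypothetical helper names. The workspace host and token are placeholders you would supply.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable


def make_get(host: str, token: str) -> Callable[[str, dict], dict]:
    """Build a GET helper for one workspace; host and token are placeholders."""
    def get(path: str, params: dict) -> dict:
        url = f"{host}{path}?{urllib.parse.urlencode(params)}"
        req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)
    return get


def active_runs_with_names(get: Callable[[str, dict], dict]) -> list:
    """Pair each active run with its job name via job_id."""
    # Step 1: map job_id -> job name (first page only).
    jobs = get("/api/2.1/jobs/list", {}).get("jobs", [])
    names = {j["job_id"]: j.get("settings", {}).get("name") for j in jobs}
    # Step 2: list active runs and attach the job name where one is known.
    runs = get("/api/2.1/jobs/runs/list", {"active_only": "true"}).get("runs", [])
    return [
        {
            "job_name": names.get(r.get("job_id")),
            "run_id": r["run_id"],
            "run_page_url": r.get("run_page_url"),
        }
        for r in runs
    ]
```

Passing the HTTP helper in as an argument keeps the matching logic separate from the network call, so it can be exercised with a stub before pointing it at a real workspace.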

Re: get job run link based on the job name or the submit body

ctiwari7 — Tue, 26 Nov 2024 15:47:44 GMT

Even the REST API returns job details by job ID, which I would first need to look up from the job_name that I have. This seems like the only possible solution, since job_id is the true identifier of any workflow job, given that multiple jobs can share the same name.
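Because names are not unique, the name-to-ID lookup probably needs to handle duplicates explicitly. A small sketch, assuming `jobs` is the `jobs` array from a list jobs response (each entry having `job_id` and `settings.name`); both function names are hypothetical:

```python
def job_ids_for_name(jobs: list, job_name: str) -> list:
    """Return every job_id whose settings.name equals job_name."""
    return [
        j["job_id"]
        for j in jobs
        if j.get("settings", {}).get("name") == job_name
    ]


def unique_job_id(jobs: list, job_name: str) -> int:
    """Return the single matching job_id, or raise if there are zero or several."""
    ids = job_ids_for_name(jobs, job_name)
    if len(ids) != 1:
        raise LookupError(f"expected exactly one job named {job_name!r}, found {len(ids)}")
    return ids[0]
```

Failing loudly on an ambiguous name avoids silently picking the wrong job's runs.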