
Is it possible to get the Job Run ID of a notebook run by dbutils.notebook.run?

hanspetter
New Contributor III

When running a notebook using dbutils.notebook.run from a master notebook, a URL to that running notebook is printed, e.g.:

Notebook job #223150

Notebook job #223151

Are there any ways to capture that job run ID (#223150 or #223151)? We have 50 or so notebooks that run in parallel, and if one of them fails it would be nice to see the actual run of the notebook without clicking every URL to find the correct one.

Thanks 🙂

ACCEPTED SOLUTION

hanspetter
New Contributor III

Hi! It's been a long time since I have looked at this problem, but I revisited it this week and found a solution. 🙂

This returns a JSON string containing information about the notebook:

dbutils.notebook.entry_point.getDbutils().notebook().getContext().toJson() 

If the notebook has been triggered by dbutils.notebook.run, we can find the tag "jobId" here, so we can return the job ID using dbutils.notebook.exit(jobId):

import json

# The notebook context as a JSON string (internal API).
notebook_info = json.loads(dbutils.notebook.entry_point.getDbutils().notebook().getContext().toJson())
try:
    # The tag "jobId" does not exist when the notebook is not triggered by dbutils.notebook.run(...)
    jobId = notebook_info["tags"]["jobId"]
except KeyError:
    jobId = -1
dbutils.notebook.exit(jobId)

When calling the notebook from the master:

jobid = dbutils.notebook.run(...) 
print(jobid) 

This outputs:

Notebook job #1522478

1522478

Well, this only works when a notebook completes successfully. What we have done is insert the job ID and notebook path, together with the current timestamp, into a database table, and query the table when an exception occurs to get the job ID.
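A minimal sketch of that logging pattern, assuming a hypothetical table run_log (the actual schema used isn't shown in this thread); each child notebook records its job ID and path up front, so failed runs can be looked up later:

from datetime import datetime

# Hypothetical table: run_log(job_id STRING, notebook_path STRING, started_at TIMESTAMP).
# Assumption: the notebook path is available under "extraContext" in the context JSON.
notebook_path = notebook_info["extraContext"]["notebook_path"]
spark.createDataFrame(
    [(str(jobId), notebook_path, datetime.now())],
    "job_id string, notebook_path string, started_at timestamp",
).write.mode("append").saveAsTable("run_log")

When a run fails, the master can then query run_log (e.g. filtered on notebook path and timestamp) to find the job ID to investigate.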


19 REPLIES

AugustoElesbão
New Contributor II

@hanspetter you can get the info via the command context:

dbutils.notebook.getContext.currentRunId

Besides that, the following methods (vals) are available in the context (a Python sketch follows the list):

  1. jobGroup: Option[String] - unique command identifier that is injected by the driver.
  2. rootRunId: Option[RunId]
  3. tags: Map[String, String] - attribution tags injected by the webapp
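In Python, these can be reached through the entry point. A hedged sketch, assuming the Scala vals are exposed as no-arg methods via py4j (these are internal APIs and may change without notice):

# Each accessor returns a JVM object; toString() is the simplest way to inspect it.
ctx = dbutils.notebook.entry_point.getDbutils().notebook().getContext()
print(ctx.rootRunId().toString())   # e.g. "Some(RunId(...))" when triggered as a job
print(ctx.tags().toString())        # the attribution tags map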

Is there any development here? currentRunId doesn't seem to work (I'm working in Python, though).

@naman1994 check which Databricks Runtime version you're using. I know that this works with 3.5, but I haven't tested with 4.0 yet.

Mine's 3.5 LTS, and it says: "NotebookHandler instance has no attribute 'getContext'"

staydol
New Contributor II

In Python you should write something like:

dbutils.notebook.entry_point.getDbutils().notebook().getContext().currentRunId().toString()
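If you need just the numeric ID out of that string, a small sketch (assuming the string looks like "Some(RunId(8188))" or "RunId(8188)", as seen later in this thread; the exact format may vary by runtime):

import re

run_id_str = dbutils.notebook.entry_point.getDbutils().notebook().getContext().currentRunId().toString()
# Pull the digits out of e.g. "RunId(8188)".
match = re.search(r"\d+", run_id_str)
run_id = int(match.group()) if match else None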

It'll be helpful if you could tell me where you found currentRunId, jobGroup, rootRunId, etc., because I can't seem to find these...

Anyway, thanks

This code works in Scala only; there is no implementation for it in the Python runtime. And according to what I was told by a Databricks employee via email, these are internal APIs and they may change without notice.

Okay, thank you 🙂

In Scala, have you ever returned this job run ID to another notebook? I'm calling many notebooks, and I want the job run ID returned from each of them to the master. I think I could do it by returning an array, but I'm not sure if that is possible, because if Scala is like C++, then I won't be able to at all...

Anyway, I would appreciate your thoughts on this.

Thanks
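For reference, a minimal Python-side sketch of that master pattern (the notebook paths here are hypothetical): each child ends with dbutils.notebook.exit(<its id>), and dbutils.notebook.run returns that exit value, so the master can collect the IDs even across parallel runs:

from concurrent.futures import ThreadPoolExecutor

# Hypothetical child notebook paths.
paths = ["/notebooks/child_a", "/notebooks/child_b"]

def run_child(path):
    # dbutils.notebook.run returns whatever the child passed to dbutils.notebook.exit.
    return path, dbutils.notebook.run(path, 3600)

with ThreadPoolExecutor(max_workers=8) as pool:
    for path, returned_id in pool.map(run_child, paths):
        print(path, returned_id)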

Anonymous
Not applicable

FYI - for a Python notebook, you can get the run ID in Scala and pass it over via a temp view:

Cell N:

%scala
val runId = dbutils.notebook.getContext
  .currentRunId
  .getOrElse(System.currentTimeMillis() / 1000L)
  .toString
Seq(runId).toDF("run_id").createOrReplaceTempView("run_id")

Cell N + 1:

runId = spark.table("run_id").head()["run_id"]

ShaunRyan1
New Contributor II

This definitely needs to be a feature request. The run ID is good, but we know it's a notebook, so it should show the notebook name, not just the word "Notebook".

Also, dbutils.notebook.getContext.currentRunId yields a different number when printed out on success:

dbutils.notebook.exit(s"${notebook} #${dbutils.notebook.getContext.currentRunId} : success")

  • The printed link gives: Notebook job #7890
  • The exit message gives: MyNotebookName #Some(RunId(8188)) : success

Well, this has helped me a lot, but I was unable to retrieve browserHostName when running the notebook, although it is present in the documentation. Can you please throw some light on this?

DungTran
New Contributor II

This is a simpler version:

dbutils.notebook.entry_point.getDbutils().notebook().getContext().jobId().get()
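If the notebook might also be run interactively (i.e. not via dbutils.notebook.run), a defensive variant may help. A sketch, assuming jobId() returns a Scala Option exposed via py4j (internal API; may change):

ctx = dbutils.notebook.entry_point.getDbutils().notebook().getContext()
# .get() on an empty Option raises, so check isDefined() first.
job_id = ctx.jobId().get() if ctx.jobId().isDefined() else None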
