09-28-2023 11:05 AM
My team has started to see long run times on cells that use the %run command to run another notebook. The notebook that we call with %run contains only variable assignments, function definitions, and library imports. In some cases I have seen run times of more than 10 minutes, which isn't behavior I would expect when the notebook doesn't actually run anything substantial.
Has anyone else run into this and how have you resolved it?
10-06-2023 01:00 AM
Hi @cmilligan ,
- Long run times with the %run command could be due to notebook size and complexity, Databricks cluster load, or network latency.
- The %run command executes another notebook immediately, making its functions and variables available in the calling notebook.
- Execution time can increase if there are many or complex operations in the notebook.
- To resolve this issue, try dbutils.notebook.run() instead of %run: it starts a new job to run the notebook, which might be more efficient. But it doesn't make the functions and variables of the called notebook available in the calling notebook.
- Example of using dbutils.notebook.run() with a 60-second timeout:
    dbutils.notebook.run("My Other Notebook", 60)
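Since the thread also mentions connection timeout failures under load, here is a minimal sketch of wrapping the call in a timed retry loop. The helper name `run_with_retry` and its `max_retries` and `wait_seconds` parameters are illustrative assumptions, not part of the Databricks API. The runner is passed in as an argument so that, inside a notebook, you can hand it `dbutils.notebook.run`.

```python
import time


def run_with_retry(run_fn, path, timeout_seconds, arguments=None,
                   max_retries=3, wait_seconds=5.0):
    """Call run_fn(path, timeout_seconds, arguments), retrying on failure.

    run_fn is expected to behave like dbutils.notebook.run.
    The helper name and the retry parameters are hypothetical.
    """
    if max_retries < 1:
        raise ValueError("max_retries must be at least 1")
    arguments = arguments or {}
    last_error = None
    for attempt in range(1, max_retries + 1):
        start = time.perf_counter()
        try:
            result = run_fn(path, timeout_seconds, arguments)
            elapsed = time.perf_counter() - start
            print(f"attempt {attempt} succeeded in {elapsed:.1f}s")
            return result
        except Exception as exc:  # broad on purpose: this is only a sketch
            elapsed = time.perf_counter() - start
            print(f"attempt {attempt} failed after {elapsed:.1f}s: {exc}")
            last_error = exc
            if attempt < max_retries:
                time.sleep(wait_seconds)
    raise last_error


# In a Databricks notebook, where dbutils is predefined:
# result = run_with_retry(dbutils.notebook.run, "My Other Notebook", 60)
```

The value returned is whatever string the called notebook passes to dbutils.notebook.exit(), so this pattern suits notebooks that return a result rather than ones that only define shared functions.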
10-06-2023 05:40 AM
Thank you for the reply @Kaniz_Fatma ,
I don't think the issue is with the performance of the notebook that we're calling with %run. The only things in this notebook are reusable Python functions and simple variable assignments (text strings, passwords, static lists, etc.). Most of the time, running this notebook with %run takes a couple of seconds or less. Lately it seems that when there is high demand on the cluster, it can take upwards of 10 minutes, or we just get connection timeout failures. This tends to happen more often when the command runs as part of a larger job or when another job is running on the cluster.
10-11-2023 12:16 PM
@Kaniz_Fatma it seems like the issue is accessing secrets in the scope. I was testing with a user that doesn't have access to the secret scope, and reading from that scope is one of the first commands in the notebook. I would expect it to fail quickly since they don't have access, but it still continues to run for a long time.
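One way to confirm that the secret lookup is the slow step is to time it in isolation, outside of %run. The following is a rough sketch: `timed_call` is a hypothetical helper, and the scope and key names in the usage comment are placeholders.

```python
import time


def timed_call(label, fn, *args, **kwargs):
    """Run fn(*args, **kwargs) and print how long it took, even if it raises."""
    start = time.perf_counter()
    try:
        return fn(*args, **kwargs)
    finally:
        elapsed = time.perf_counter() - start
        print(f"{label}: {elapsed:.2f}s")


# In a Databricks notebook, where dbutils is predefined
# ("my-scope" and "my-key" are placeholder names):
# password = timed_call("secret lookup", dbutils.secrets.get,
#                       scope="my-scope", key="my-key")
```

Because the timing is printed in the finally block, a lookup that eventually fails with a permission error still reports how long it took to fail.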
10-12-2023 12:07 AM
Hi @cmilligan , Can you try this with another user? Also, with a different notebook and cluster? What is the DBR version now running?
10-12-2023 05:53 AM
Hi @Debayan,
I've tried this with multiple users and notebooks. We've also used multiple clusters, one with 10.4 LTS and the other with 13.3 LTS. The issue is still happening.
10-15-2023 10:01 PM
Hi, do you see anything suspicious in the log4j section of the cluster driver logs?
10-23-2023 08:14 AM
@Debayan I'm not really sure. I haven't read the logs before, and looking through them I can't tell whether there is something that should stand out.
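For anyone in the same position, a possible starting point is to download the driver's log4j log and filter it down to warnings, errors, and lines that mention the suspected cause. This is only a sketch: the function name, the keyword list, and the assumption that the level appears as a space-delimited token such as " WARN " are all guesses, not something the thread confirms.

```python
def suspicious_lines(lines, levels=("WARN", "ERROR"),
                     keywords=("secret", "timeout", "timed out")):
    """Return (line_number, text) pairs worth a closer look.

    Assumes each log line carries its level as a space-delimited token;
    the keywords are illustrative guesses based on the symptoms above.
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        lowered = line.lower()
        has_level = any(f" {level} " in line for level in levels)
        has_keyword = any(word in lowered for word in keywords)
        if has_level or has_keyword:
            hits.append((number, line.rstrip("\n")))
    return hits


# Usage with a log file downloaded from the cluster (the path is a placeholder):
# with open("log4j-active.log", encoding="utf-8", errors="replace") as handle:
#     for number, text in suspicious_lines(handle):
#         print(number, text)
```

Comparing the timestamps of the matching lines with the time the slow %run cell started can help narrow down which entries are relevant.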