Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

Notebook cell gets hung up but code completes

tim-mcwilliams
New Contributor III

I've been running into an issue when running a pymc-marketing model in a Databricks notebook. The cell that fits the model hangs and the progress bar stops moving; however, the code completes and writes all the expected output to a folder. After the code completes, I have to detach the notebook because hitting Interrupt gets no response. I took a peek at the cluster logs and can confirm everything runs as expected (see screenshot!).

Any ideas what the issue is here, or have you run into the same issue?

2 REPLIES

Kaniz
Community Manager

Hi @tim-mcwilliams,

It sounds like you're encountering a situation where the notebook cell appears to hang while running a pymc-marketing model in Databricks, even though the code eventually completes successfully.

Let's explore some potential reasons for this behaviour:

  1. Resource Constraints:

    • Check if your Databricks cluster has sufficient resources (CPU, memory, and disk space) to handle the model-fitting process. If the cluster is under-provisioned, it might cause the progress bar to stall even though the code continues executing.
    • Consider increasing the cluster resources or using a larger instance type.
  2. Concurrency and Parallelism:

    • Cells within a Databricks notebook run one at a time by default, but other notebooks or jobs attached to the same cluster can run concurrently. They might compete for resources and cause the progress bar to hang.
    • Try running the model on a cluster with no other workloads attached, or at a time when nothing else is executing.
  3. Interrupt Signal Handling:

    • The fact that hitting "Interrupt" doesn't respond suggests that the notebook might not be handling the interrupt signal properly.
    • Check if there are any custom signal handlers or other code that interfere with the default behaviour of interrupting a cell.
    • You can also try restarting the kernel or detaching the notebook as you've been doing.
  4. Code Execution and Output:

    • Since the code completes successfully and dumps the output into a folder, it seems that the actual computation is working as expected.
    • Verify that the output files are correct and contain the expected results.
  5. Databricks Environment and Dependencies:

    • Ensure that all necessary dependencies (including pymc-marketing) are correctly installed in your Databricks environment.
    • Check for any conflicting libraries or versions that might cause unexpected behavior.
  6. Cluster Logs and Monitoring:

    • Continue monitoring the cluster logs to see if any specific errors or warnings occur during the execution of the model.
    • Look for any patterns or clues that might help identify the issue.
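
A few of the checks above (resources, interrupt handling, dependencies) can be scripted in a notebook cell before fitting. This is only a minimal sketch using the Python standard library, not a Databricks or pymc-marketing API. The package list is an illustrative assumption, the RAM figure relies on Linux `sysconf` names, and it is also an assumption that the notebook's Interrupt button is delivered as SIGINT at all.

```python
import importlib.metadata
import os
import shutil
import signal


def resource_snapshot(path="/"):
    """Return CPU count, total RAM in GiB (or None), and free disk in GiB for this node."""
    try:
        # These sysconf names are available on Linux; elsewhere we fall back to None.
        ram_gib = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 2**30
    except (ValueError, OSError, AttributeError):
        ram_gib = None
    return {
        "cpus": os.cpu_count(),
        "ram_gib": ram_gib,
        "disk_free_gib": shutil.disk_usage(path).free / 2**30,
    }


def sigint_handler_is_default():
    """True if SIGINT is still handled by Python's default handler (raises KeyboardInterrupt)."""
    return signal.getsignal(signal.SIGINT) is signal.default_int_handler


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is not installed."""
    versions = {}
    for name in packages:
        try:
            versions[name] = importlib.metadata.version(name)
        except importlib.metadata.PackageNotFoundError:
            versions[name] = None
    return versions


print(resource_snapshot())
print("default SIGINT handler:", sigint_handler_is_default())
# Illustrative package list (an assumption); adjust to what your model actually imports.
print(installed_versions(["pymc-marketing", "pymc", "pytensor", "arviz"]))
```

A `False` from `sigint_handler_is_default()` is not necessarily a problem, since a notebook kernel may install its own handler. It is simply a starting point for the custom-handler check in item 3.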

Hopefully, you'll find a solution soon! 😊🚀

 

tim-mcwilliams
New Contributor III

Hi @Kaniz,

Thanks for the feedback here as well as on the other discussion forum. I've commented on your troubleshooting tips on that board. One thing to touch upon here:

  • Verify that the output files are correct and contain the expected results.
    • All of the modeling outputs are as expected and correct.
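
A check of this kind can be scripted. The following is a minimal sketch, assuming a hypothetical output folder and hypothetical file names (neither is given in this thread). It only confirms that each expected file exists and is non-empty, not that its contents are correct.

```python
from pathlib import Path


def missing_or_empty(output_dir, expected_names):
    """Return the expected file names that are absent or zero bytes in output_dir."""
    root = Path(output_dir)
    problems = []
    for name in expected_names:
        path = root / name
        if not path.is_file() or path.stat().st_size == 0:
            problems.append(name)
    return problems


# Hypothetical usage; the folder and file names below are placeholders:
# missing_or_empty("/path/to/model_output", ["trace.nc", "summary.csv"])
```

An empty list back means every expected file is present and non-empty.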