05-01-2024 10:35 AM
I have been running into an issue when running a pymc-marketing model in a Databricks notebook. The cell that fits the model hangs and the progress bar stops moving; however, the code completes and writes all the needed output to a folder. After the code completes, I have to detach the notebook because hitting Interrupt gets no response. I took a peek at the cluster logs and can confirm everything runs as expected (see screenshot!).
Any ideas on what the issue is here, or have you run into the same thing?
05-09-2024 10:04 AM
Hi @Retired_mod,
Thank you for replying with some troubleshooting steps. Really appreciate it! I've added some more context below in red.
05-10-2024 08:02 AM
Hey @tim-mcwilliams,
I've got exactly, I mean exactly, the same problem. Have you found a solution?
05-10-2024 11:03 AM
Hey @Piotrus321 ,
I have not found any solution as of yet. I've been messing with cluster configs, but it seems to be a bigger problem here than compute power.
05-13-2024 06:10 AM
Hey @tim-mcwilliams
I think I've found a solution that seems to work. It seems that the output displayed by pymc-marketing somehow crashed the Databricks cell. I disabled it by adding %%capture at the beginning of the cell and ; at the end of the cell.
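For anyone who wants a similar effect from plain Python rather than the %%capture cell magic, here is a minimal sketch using only the standard library. It captures whatever a function writes to stdout and stderr while it runs. Note that `run_quietly` and `fake_fit` are hypothetical names made up for illustration; `fake_fit` just stands in for the real model-fitting call. Also, this only catches text written through Python's `sys.stdout`/`sys.stderr`, so rich notebook display output may still get through.

```python
import contextlib
import io
import sys


def run_quietly(func, *args, **kwargs):
    """Call func while capturing anything it writes to stdout/stderr."""
    out, err = io.StringIO(), io.StringIO()
    with contextlib.redirect_stdout(out), contextlib.redirect_stderr(err):
        result = func(*args, **kwargs)
    return result, out.getvalue(), err.getvalue()


def fake_fit(n_draws):
    # Hypothetical stand-in for the real model-fitting call:
    # it prints progress-style lines to stderr and a final line to stdout.
    for i in range(n_draws):
        print(f"draw {i}", file=sys.stderr)
    print("done")
    return n_draws


# Nothing is shown in the cell; the text ends up in the returned strings.
result, captured_out, captured_err = run_quietly(fake_fit, 3)
```

Separately, PyMC's `pm.sample` accepts a `progressbar` argument that turns the progress bar off at the source. Whether the pymc-marketing fit call in your version forwards that argument is an assumption worth checking before relying on it.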
05-14-2024 09:37 AM
Hey @Piotrus321
Good find! I gave that a try, but unfortunately I am getting the same behavior. I added %%capture to both the beginning and end of the cell that runs the model fitting code. The cell ran for about an hour and a half while I was doing some other work. I came back to it and canceled the cell, but it still hung on me.
My data isn't big: about 4 months' worth, with about 6 variables. The same model runs in about 1.5 minutes on my local machine.
05-14-2024 10:15 AM - edited 05-14-2024 10:18 AM
This can be a frustrating situation where the notebook cell appears stuck, but the code execution actually finishes in the background. Here are some troubleshooting steps to try:
1. Restart vs Interrupt:
2. Check for Deadlocks:
3. Identify Long-Running Processes:
4. Resource Constraints:
5. Concurrency and Parallelism:
6. Logging and Debugging:
7. Update Libraries and Restart Kernel:
8. Consider Alternatives:
Monday
Hi @tim-mcwilliams, did you manage to fix the issue or identify the root cause?
It would be really helpful to know. Thanks a lot.