Hi all,
I've been trying to make use of some of the more recent debugging tools in Databricks: pdb in the Databricks web interface with the variable explorer, as described in this article. I've also been trying to debug locally using the VS Code extension, as mentioned in this article.
However, both approaches to debugging have been unsuccessful for the following reasons:
- pdb doesn't seem to work with streaming. In a notebook that uses streaming, the command box doesn't come up, so I am unable to enter commands like continue (although I can see that execution has paused if I look at the cluster logs).
- The VS Code extension depends on Databricks Connect for its local debug functionality. Databricks Connect does not yet seem to support RDDs or the SparkContext object, so I can't proceed with this route either.
Am I out of luck if I want to set breakpoints and inspect variable values in my code? Debugging with only print statements and console logs is becoming very time-consuming as the code grows more complex.
Any suggestions would be greatly appreciated.
Many thanks