
Debugging options if you are using streaming, RDDs and SparkContext?

Shenstone
New Contributor

Hi all,

I've been trying to use some of the more recent debugging tools in Databricks: pdb in the Databricks web interface, together with the variable explorer described in this article. I've also been trying to debug locally using the VS Code extension, as mentioned in this article.

However, both approaches to debugging have been unsuccessful for the following reasons:

  • pdb doesn't seem to work with streaming. In a notebook that uses streaming, the command box never comes up, so I can't enter commands such as continue (although the cluster logs show that execution has paused).
  • The VS Code extension depends on Databricks Connect for its local debug functionality. Connect does not yet seem to support RDDs or the SparkContext object, so I can't proceed with this route either.
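
To make the second point concrete, here is a minimal sketch of the kind of code involved. The function names (`parse_event`, `enrich_partition`) and the record format are hypothetical illustrations, not the actual job. The Spark calls appear only in comments so that the snippet runs without a cluster. The per-partition logic is plain Python, but on a cluster it is reached through the SparkContext and RDD API, which is the path Databricks Connect does not seem to support.

```python
from typing import Dict, Iterable, Iterator


def parse_event(line: str) -> Dict[str, str]:
    """Split a 'key=value;key=value' record into a dict (hypothetical format)."""
    fields = (pair.split("=", 1) for pair in line.split(";") if pair)
    return {key.strip(): value.strip() for key, value in fields}


def enrich_partition(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Per-partition logic of the kind one would want to set a breakpoint in."""
    for line in lines:
        event = parse_event(line)
        # Flag records that are missing the (hypothetical) "id" field.
        event["valid"] = str("id" in event)
        yield event


# On a cluster this would be reached through the RDD API, for example:
#   rdd = sc.parallelize(sample).mapPartitions(enrich_partition)
# Here the same function is called directly on a small in-memory sample.
sample = ["id=1;type=click", "type=view"]
results = list(enrich_partition(sample))
```

Called directly like this, the function is ordinary Python and a debugger can step through it. The difficulty described above arises once the same logic runs through the RDD API or inside a streaming notebook.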

Am I out of luck if I want to set breakpoints and inspect variable values in my code? Debugging with only print statements and console logs is becoming very time-consuming as the code grows more complex.
Any suggestions would be greatly appreciated.

Many thanks

