I am experiencing an issue with a PySpark job that behaves differently depending on the compute environment in Databricks. This is blocking us from deploying the job to the PROD environment for our planned release.
Specifically:
- When running the job on a personal cluster, everything works as expected. All exceptions within the try/except blocks are successfully caught and handled.
- However, when I run the same job on a shared cluster, the exceptions are not caught by the try/except blocks and the job fails.
Any guidance or insights you could provide would be greatly appreciated.
Example (running the piece of code in a workspace notebook): https://github.com/user-attachments/assets/78b38c5a-98f6-4bb0-82c3-45946d6c5500
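For clarity, here is a minimal sketch of the kind of pattern the job uses (the path and exception handling below are simplified placeholders, not the actual production code). It catches the expected Spark exception and falls back to logging the concrete exception class, so we can see what actually surfaces on the shared cluster:

```python
from pyspark.sql.utils import AnalysisException

# `spark` is the SparkSession provided by the Databricks notebook environment.
try:
    # Intentionally read a path that does not exist to force a failure
    # (placeholder path, not the real one).
    df = spark.read.format("delta").load("/mnt/nonexistent/path")
    df.show()
except AnalysisException as e:
    # This branch is hit on the personal cluster.
    print(f"Caught AnalysisException: {e}")
except Exception as e:
    # Fallback: log the actual exception class to see what the
    # shared cluster raises instead.
    print(f"Caught {type(e).__module__}.{type(e).__name__}: {e}")
```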
Any ideas? @andrews