ISSUE: PySpark task exception handling on "Shared Compute" cluster

geronimo_signol — Thu, 29 Aug 2024 13:15:27 GMT

I am experiencing an issue with a PySpark job that behaves differently depending on the compute environment in Databricks. This is blocking us from deploying the job to the PROD environment for our planned release.

Specifically:

- When I run the job on a personal cluster, everything works as expected. All exceptions raised inside the try/except blocks are caught and handled.
- However, when I run the same job on a shared cluster, the job fails, and none of the exceptions are caught by the try/except blocks.

Any guidance or insights you could provide would be greatly appreciated.

Example (a piece of code run in a workspace notebook): https://github.com/user-attachments/assets/78b38c5a-98f6-4bb0-82c3-45946d6c5500
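
For anyone trying to reproduce this without the screenshot: the snippet below is not the job itself, only a rough sketch of the pattern described above. The names `run_guarded`, `spark_demo`, and `fail_on_three` are hypothetical, and it assumes the failure comes from a Python UDF and only surfaces when an action such as `collect()` runs. The helper catches broadly and reports the concrete exception class, so running it on both cluster types should show whether the class that reaches the driver differs between them.

```python
from typing import Any, Callable, Optional, Tuple


def run_guarded(action: Callable[[], Any]) -> Tuple[Optional[Any], Optional[str]]:
    """Run a zero-argument callable and report what it raised, if anything.

    Returns (result, None) on success, or (None, "module.ClassName") on failure.
    """
    try:
        return action(), None
    except Exception as exc:  # deliberately broad: the goal is to see the class
        cls = type(exc)
        return None, f"{cls.__module__}.{cls.__qualname__}"


def spark_demo(spark) -> Tuple[Optional[Any], Optional[str]]:
    """Hypothetical reproduction: a UDF that fails on one row.

    Assumes `spark` is the SparkSession that a Databricks notebook provides.
    """
    # Imported here so the helper above stays usable without PySpark installed.
    from pyspark.sql.functions import udf
    from pyspark.sql.types import IntegerType

    @udf(returnType=IntegerType())
    def fail_on_three(x):
        if x == 3:
            raise ValueError("bad value: 3")
        return x

    # Building the DataFrame is lazy; the UDF only runs when collect() is called,
    # so the action is what has to be inside the guarded call.
    df = spark.range(5).withColumn("y", fail_on_three("id"))
    return run_guarded(df.collect)
```

In a notebook this would be called as `print(spark_demo(spark))` once on the personal cluster and once on the shared cluster, and the two reported class names compared.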

Any ideas? @andrews
