Alberto_Umana
Databricks Employee

Hi @noorbasha534,

 

  • Catching PySparkException, as you describe, is a valid way to handle errors in PySpark. It lets you catch the exceptions raised by PySpark operations specifically and handle them accordingly.

  • Logging errors into a table is advisable, especially if you plan to build an operational dashboard to monitor and analyze them. An errors table with columns for the date, error class, message parameters, and SQL state makes errors easy to query and report on.

  • Moving from ".txt" log files in an ADLS storage account to table-based logging can indeed simplify reporting. Errors stored in a table can be queried and analyzed directly with SQL, which is typically more efficient than periodically profiling storage accounts and containers.
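
To make the points above concrete, here is a minimal sketch of catching a PySparkException and appending one row per error to a table. It assumes a PySpark version where PySparkException exposes getErrorClass(), getMessageParameters(), and getSqlState(). The table name ops.pipeline_errors, the column names, and the helper functions are hypothetical placeholders, so adjust them to your own catalog and conventions.

```python
import json
from datetime import datetime, timezone

try:
    from pyspark.errors import PySparkException
except ImportError:
    # Fallback so the helper below can be read and unit-tested without PySpark.
    PySparkException = Exception

# Hypothetical column layout for the errors table (all strings for simplicity).
ERROR_COLUMNS = ["error_ts", "error_class", "message_parameters", "sql_state", "message"]
ERROR_SCHEMA = ", ".join(f"{c} string" for c in ERROR_COLUMNS)


def error_record(exc, now=None):
    """Flatten an exception into a dict with one key per errors-table column."""

    def call(name):
        # Non-PySpark exceptions lack these accessors, so fall back to None.
        method = getattr(exc, name, None)
        return method() if callable(method) else None

    params = call("getMessageParameters")
    return {
        "error_ts": (now or datetime.now(timezone.utc)).isoformat(),
        "error_class": call("getErrorClass"),
        "message_parameters": json.dumps(params) if params is not None else None,
        "sql_state": call("getSqlState"),
        "message": str(exc),
    }


def log_error(spark, exc, table="ops.pipeline_errors"):
    """Append one error row to the (hypothetical) errors table."""
    rec = error_record(exc)
    row = tuple(rec[c] for c in ERROR_COLUMNS)
    spark.createDataFrame([row], schema=ERROR_SCHEMA).write.mode("append").saveAsTable(table)


def run_logged(spark, query):
    """Run a SQL query; on a PySpark error, log it to the table and re-raise."""
    try:
        return spark.sql(query).collect()
    except PySparkException as exc:
        log_error(spark, exc)
        raise
```

With rows landing in a table like this, a dashboard query can be as simple as grouping by error_class and date, rather than scanning ".txt" files in the storage account.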