Although the PySpark documentation states that DataFrame.foreachPartition() is a shorthand for df.rdd.foreachPartition(), there is an important difference when running on Databricks shared clusters (especially with Unity Catalog and Spark Connect). D...
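A minimal sketch of the DataFrame-level call, assuming the difference in question is that shared clusters restrict direct RDD access, so going through df.rdd may fail where DataFrame.foreachPartition() works. The names summarize_partition, handle_partition, and run_on_dataframe are hypothetical and used only for illustration.

```python
from typing import Any, Iterable, Iterator


def summarize_partition(rows: Iterable[Any]) -> int:
    """Count the rows of one partition; stands in for real per-partition work."""
    return sum(1 for _ in rows)


def handle_partition(rows: Iterator[Any]) -> None:
    """Callback for foreachPartition; receives an iterator over one partition's rows."""
    count = summarize_partition(rows)
    # On a cluster this runs on executors, so the output typically
    # shows up in executor logs rather than in the notebook cell.
    print(f"processed {count} rows")


def run_on_dataframe(df: Any) -> None:
    """Call the DataFrame API directly instead of going through df.rdd."""
    df.foreachPartition(handle_partition)


# Local check without a cluster: a plain iterator stands in for a partition.
handle_partition(iter([{"id": 1}, {"id": 2}]))
```

Keeping the per-partition logic in a plain function that only needs an iterator makes it possible to test it locally before passing it to df.foreachPartition() on the cluster.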
This issue appears to be related to Azure Storage access through Unity Catalog rather than the data itself, especially since the same workload was working fine with Hive and the failure is intermittent. A few areas worth checking:
1. Storage Credential...
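Because the failure is intermittent, it can help to record exactly which attempts fail and with what error before changing anything. The following is only a sketch: probe_access is a hypothetical helper, and the callable passed to it is whatever storage access is failing in your workload.

```python
import time
from typing import Callable, List, Tuple


def probe_access(
    list_path: Callable[[], object],
    attempts: int = 5,
    delay_seconds: float = 0.0,
) -> List[Tuple[int, str]]:
    """Call list_path repeatedly; return (attempt_number, error_text) per failure."""
    failures: List[Tuple[int, str]] = []
    for attempt in range(1, attempts + 1):
        try:
            list_path()
        except Exception as exc:  # broad on purpose: every error is recorded
            failures.append((attempt, f"{type(exc).__name__}: {exc}"))
        if delay_seconds:
            time.sleep(delay_seconds)
    return failures
```

In a Databricks notebook this could be called as probe_access(lambda: dbutils.fs.ls("<your-external-location-path>"), attempts=20, delay_seconds=30), where the path is a placeholder. The recorded error text shows whether the failures are authorization errors or something else.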
The following may be the causes:
1. Different authentication methods
- The UI's external location uses Unity Catalog credentials
- Your dbutils.fs.ls() command uses the compute's Spark configurations
- These may be using different credentials with differe...
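One way to check the first cause is to see whether the compute carries its own Azure storage settings. The sketch below assumes you have already collected the cluster's Spark configuration into a dictionary (for example, copied from the cluster's configuration page). The helper azure_storage_settings and the sample values are hypothetical. The prefixes match the Hadoop ABFS setting names such as fs.azure.account.auth.type.

```python
from typing import Dict, Mapping

# Prefixes under which Azure storage credentials are usually configured.
AZURE_PREFIXES = ("fs.azure.", "spark.hadoop.fs.azure.")


def azure_storage_settings(conf: Mapping[str, str]) -> Dict[str, str]:
    """Return the Azure storage-related keys, with values redacted for safe sharing."""
    return {
        key: "<redacted>"
        for key in sorted(conf)
        if key.startswith(AZURE_PREFIXES)
    }


# Hypothetical sample configuration, for illustration only.
sample_conf = {
    "spark.app.name": "demo",
    "fs.azure.account.auth.type": "OAuth",
    "spark.hadoop.fs.azure.account.oauth2.client.id": "hypothetical-client-id",
}
print(azure_storage_settings(sample_conf))
```

If this returns any keys, the compute may be authenticating with those settings rather than with the Unity Catalog credential behind the external location, which would explain why the UI and dbutils.fs.ls() behave differently.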