Unable to read excel files in the Azure databricks (UC enabled workspace)

vishaldevarajan
New Contributor II

Hello,

After adding the maven library com.crealytics:spark-excel_2.12:0.13.5 under the artifact allowlist, I have installed it at the Azure databricks cluster level (shared, unity catalog enabled, runtime 15.4). Then I tried to create a df for the excel file as follows,
df = spark.read.format('com.crealytics.spark.excel').option('header',True).load('abfss://dev@dlake.dfs.core.windows.net/myfiles/file1.xlsx')

And while trying to display the above df, or trying to see the columns using df.columns, I got the following error,

[INSUFFICIENT_PERMISSIONS] Insufficient privileges: User does not have permission SELECT on any file. SQLSTATE: 42501 File <command-6451936430536458>, line 1----> 1 df.columns File /databricks/spark/python/pyspark/sql/connect/client/core.py:2155, in SparkConnectClient._handle_rpc_error(self, rpc_error) 2140 raise Exception( 2141 "Python versions in the Spark Connect client and server are different. " 2142 "To execute user-defined functions, client and server should have the " (...) 2151 "https://docs.databricks.com/en/release-notes/serverless.html" target="_blank" rel="noopener noreferrer">https://docs.databricks.com/en/release-notes/serverless.html</a>.</span><span>&quot;</span> 2152 ) 2153 # END-EDGE-> 2155 raise convert_exception( 2156 info, 2157 status.message, 2158 self._fetch_enriched_error(info), 2159 self._display_server_stack_trace(), 2160 ) from None 2162 raise SparkConnectGrpcException(status.message) from None 2163 else

 

Additional information:
1. Since it's a Unity catalog-enabled workspace, I'm following the external location and not the traditional mount point for the ADLS gen2 where the excel files are present. In a different workspace (non-unity catalog enabled), I was able to create the df successfully for the same code, but the file location is based on the mount point path.

2. Also tried reading excel file from Volume. Still the same error.

3. I was able to create a df for the csv, json files in the same adls gen2 as follows,

df = spark.read.format('csv').option('header',True).load('abfss://dev@dlake.dfs.core.windows.net/bom/validity/BOM_VALIDITY.csv')

I'm not having any issues and was able to display the df. I have the following permissions over the external location,  create external table, read files and write files.

 

What's wrong, and kindly help me in this regard. TIA.