Hello,
I recently noticed that we have never run ANALYZE TABLE. This came up while investigating Z-ordering and liquid clustering, when I saw that the query plans for our Delta tables were not taking either into account.
I'm trying to run the following command to compute statistics for our Delta tables:
spark.sql(f"ANALYZE TABLE delta.my_table_path COMPUTE DELTA STATISTICS") (my_table_path is backticked)
my_table_path is an abfss path; we are not currently using Unity Catalog.
The error we receive is:
WARN FileSystem: Failed to initialize filesystem my_table_path: Failure to initialize configuration for storage account XXXXXXXX.dfs.core.windows.net: Invalid configuration value detected for fs.azure.account.keyInvalid configuration value detected for fs.azure.account.key
However, we can successfully run other commands against the same table path, such as:
spark.sql(f"DESCRIBE DETAIL delta.my_table_path").show()
In addition, reads, writes, and OPTIMIZE all work, and I was able to deep clone the source data to this location to do all of this testing.
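For reference, here is a minimal sketch of how both statements are built from the same path, so the only difference between the working call and the failing one is the statement itself. The helper names (delta_path_ref, build_statements) and the path in EXAMPLE_PATH are placeholders made up for this post, not our real code or storage account.

```python
def delta_path_ref(table_path: str) -> str:
    """Return a delta.`<path>` reference with the path wrapped in backticks."""
    if "`" in table_path:
        raise ValueError("table path must not contain backticks")
    return f"delta.`{table_path}`"


def build_statements(table_path: str) -> dict:
    """Build the DESCRIBE DETAIL and ANALYZE TABLE statements for one path."""
    ref = delta_path_ref(table_path)
    return {
        "describe": f"DESCRIBE DETAIL {ref}",
        "analyze": f"ANALYZE TABLE {ref} COMPUTE DELTA STATISTICS",
    }


# Placeholder path (assumption): not our real container or storage account.
EXAMPLE_PATH = "abfss://<container>@<account>.dfs.core.windows.net/<table_dir>"

if __name__ == "__main__":
    for name, stmt in build_statements(EXAMPLE_PATH).items():
        print(f"{name}: {stmt}")
        # In a notebook with an active session, each one is run as spark.sql(stmt).
```

With the real path substituted, the "describe" statement succeeds and the "analyze" statement produces the warning above.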
Does anyone know what might be at play here? For example, does ANALYZE need elevated permissions on the blob storage that we might be missing?
I also believe that not having run ANALYZE is the reason our execution plans are not being optimized to use Z-ordering or liquid clustering. Is this a correct assumption? Currently the execution plans ignore both, even though we have run OPTIMIZE.
Thanks in advance if you're able to look at this!