Hi
I'm using Unity Catalog on Azure with a Managed Identity connected to my Storage Account. I can read and write data without issues, and I can interact with the data using both PySpark and SQL.
I can display the history by running a cell with the following code:
from delta.tables import DeltaTable
dt = DeltaTable.forName(spark, "CATALOG.SCHEMA.TABLE")
dt.history().display()
However, if I run the same history display command in a new cell, reusing dt from the first cell, it fails:
dt.history().display()
I get the following error:
Failure to initialize configuration for storage account NAME.dfs.core.windows.net: Invalid configuration value detected for fs.azure.account.keyInvalid configuration value detected for fs.azure.account.key
I can rerun the first cell as many times as I like and the history is displayed each time, but running display in a new cell throws the error.
Trying to persist the history also throws an error when the persist call runs in a separate cell. First cell:
from delta.tables import DeltaTable
dt = DeltaTable.forName(spark, "CATALOG.SCHEMA.TABLE")
Persist command in a new cell:
a = dt.history().persist()
Throws:
AnalysisException: [RequestId=REQUESTID ErrorClass=INVALID_PARAMETER_VALUE.LOCATION_OVERLAP] Input path url 'abfss://CONTAINER@STORAGEACCOUNT.dfs.core.windows.net/__unitystorage/catalogs/SOMEID/tables/TABLEID' overlaps with managed storage within 'GenerateTemporaryPathCredential' call
I can run the history command in a SQL cell multiple times without any issues:
%sql
describe history CATALOG.SCHEMA.TABLE
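For comparison, the same SQL statement can also be issued from a Python cell through spark.sql. This is only a minimal sketch: the helper names describe_history_sql and history_via_sql are hypothetical, and it assumes (not verified here) that spark.sql takes the same path as the %sql cell rather than the DeltaTable path that fails above.

```python
from typing import Optional


def describe_history_sql(table_name: str, limit: Optional[int] = None) -> str:
    """Build a DESCRIBE HISTORY statement for a three-level table name.

    Hypothetical helper, not part of the delta or pyspark APIs.
    """
    statement = f"DESCRIBE HISTORY {table_name}"
    if limit is not None:
        # Optional LIMIT clause restricts output to the most recent versions.
        statement += f" LIMIT {int(limit)}"
    return statement


def history_via_sql(spark, table_name: str, limit: Optional[int] = None):
    """Return the table history as a DataFrame by going through spark.sql.

    Assumption: `spark` is the SparkSession that the notebook predefines.
    """
    return spark.sql(describe_history_sql(table_name, limit))


# In a notebook cell, where `spark` is already defined:
# history_via_sql(spark, "CATALOG.SCHEMA.TABLE").display()
```

Because the table is looked up by name on every call, nothing is carried over between cells, unlike the dt variable in the failing case.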
TL;DR: delta.tables.DeltaTable.history behaves inconsistently across notebook cells, while SQL describe history works every time.