Hi everybody,
I tested the temporary table credentials API. It works great, as long as I use the credentials outside of Databricks (e.g. in a local duckdb instance).
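For reference, this is roughly what works for me outside of Databricks. The workspace URL, token and table id are placeholders, and the endpoint path / response field names are how I understand the Unity Catalog REST docs, so double-check them for your setup:

```python
import duckdb
import requests

# --- placeholders: adjust to your workspace / table ---
HOST = "https://adb-1234567890123456.7.azuredatabricks.net"
TOKEN = "<databricks-token>"
TABLE_ID = "<unity-catalog-table-uuid>"

# 1) ask Unity Catalog for a short-lived READ credential for the table
resp = requests.post(
    f"{HOST}/api/2.1/unity-catalog/temporary-table-credentials",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"table_id": TABLE_ID, "operation": "READ"},
)
resp.raise_for_status()
cred = resp.json()
sas_token = cred["azure_user_delegation_sas"]["sas_token"]  # Azure flavour of the response
table_url = cred["url"]  # abfss://<container>@<account>.dfs.core.windows.net/<path>

# 2) hand the SAS to duckdb's azure extension via a secret
con = duckdb.connect()
con.execute("INSTALL azure")
con.execute("LOAD azure")
con.execute(f"""
    CREATE SECRET uc_temp (
        TYPE AZURE,
        CONNECTION_STRING 'BlobEndpoint=https://<account>.blob.core.windows.net;SharedAccessSignature={sas_token}'
    )
""")

# 3) read the Delta table (needs a recent duckdb + delta extension;
#    read_parquet on the data files would be the fallback)
con.execute("INSTALL delta")
con.execute("LOAD delta")
print(con.sql(f"SELECT count(*) FROM delta_scan('{table_url}')").fetchall())
```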
But as soon as I try to use the short-lived credentials (an Azure SAS in my case) in Databricks, e.g. in a notebook, it stops working:
1. duckdb: complains "AzureBlobStorageFileSystem could not open file: unknown error occurred, this could mean the credentials used were wrong."
2. azure-storage-blob python package: "Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature." (see the sketch after this list)
3. spark, read abfss url directly: "ErrorClass=INVALID_PARAMETER_VALUE.LOCATION_OVERLAP] Input path url overlaps with managed storage within 'CheckPathAccess' call. ."
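For 2., the check inside the notebook was essentially just this, with account, container and SAS taken from the same temporary-credentials response (all names are placeholders):

```python
from azure.storage.blob import ContainerClient

# placeholders, taken apart from the abfss url / SAS in the temporary-credentials response
ACCOUNT = "<storage-account>"
CONTAINER = "<container>"
SAS_TOKEN = "<sas_token from azure_user_delegation_sas>"

# try to list the table's files using only the short-lived SAS
# -> inside the notebook this fails with the error quoted above
client = ContainerClient(
    account_url=f"https://{ACCOUNT}.blob.core.windows.net",
    container_name=CONTAINER,
    credential=SAS_TOKEN,
)
for blob in client.list_blobs(name_starts_with="<table-path>/"):
    print(blob.name)
```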
The third one made me think: is access to known managed storage locations blocked inside Databricks for all kinds of libraries, even when accessing with a temporary credential? That would mean temporary credentials can only be used outside of Databricks, and therefore it would not be possible to read the data inside Databricks with any engine other than Spark.
And if not: has anybody managed to run duckdb inside Databricks, directly accessing the data in the metastore?
(I know that I could always go from pyspark to pandas/polars/arrow/duckdb/..., but I would be interested in skipping pyspark entirely, especially when the amount of data is rather small.)
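The workaround I mean is something like this, which works fine but keeps Spark in the loop (table name is a placeholder):

```python
import duckdb

# let Spark read the UC table, then query the result with duckdb
# (`spark` is the session predefined in a Databricks notebook)
pdf = spark.table("my_catalog.my_schema.my_table").toPandas()
print(duckdb.sql("SELECT count(*) FROM pdf").df())  # duckdb picks up the pandas frame by name
```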