Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Use temporary table credentials to access data in Databricks

matthiasn
New Contributor

Hi everybody,

I tested the temporary table credentials API. It works great, as long as I use the credentials outside of Databricks (e.g. in a local DuckDB instance).
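
For reference, the local flow that works looks roughly like this. This is a minimal sketch with placeholder host, token, and table names; the response fields (table_id, azure_user_delegation_sas.sas_token, url) follow the Unity Catalog REST API, and the exact DuckDB secret syntax may vary with the azure/delta extension versions:

```python
import duckdb
import requests

HOST = "https://<workspace>.azuredatabricks.net"  # placeholder
HEADERS = {"Authorization": "Bearer <personal-access-token>"}  # placeholder

# 1. Resolve the Unity Catalog table_id from the three-level table name.
table = requests.get(
    f"{HOST}/api/2.1/unity-catalog/tables/main.my_schema.my_table",
    headers=HEADERS,
).json()

# 2. Request short-lived READ credentials; on Azure the response carries a
#    user-delegation SAS plus the table's storage URL.
creds = requests.post(
    f"{HOST}/api/2.0/unity-catalog/temporary-table-credentials",
    headers=HEADERS,
    json={"table_id": table["table_id"], "operation": "READ"},
).json()
sas_token = creds["azure_user_delegation_sas"]["sas_token"]
table_url = creds["url"]  # abfss://<container>@<account>.dfs.core.windows.net/...

# 3. Hand the SAS to DuckDB's azure extension and scan the Delta table.
con = duckdb.connect()
for stmt in ("INSTALL azure", "LOAD azure", "INSTALL delta", "LOAD delta"):
    con.execute(stmt)
con.execute(
    "CREATE SECRET uc_sas (TYPE AZURE, CONNECTION_STRING "
    f"'BlobEndpoint=https://<account>.blob.core.windows.net;SharedAccessSignature={sas_token}')"
)
print(con.execute(f"SELECT count(*) FROM delta_scan('{table_url}')").fetchone())
```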

But as soon as I try to use the short-lived credentials (an Azure SAS in my case) inside Databricks, e.g. in a notebook, it no longer works:

1. DuckDB complains: "AzureBlobStorageFileSystem could not open file: unknown error occurred, this could mean the credentials used were wrong."
2. The azure-storage-blob Python package fails with: "Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature."
3. Spark, reading the abfss URL directly (roughly as sketched below), throws: "[ErrorClass=INVALID_PARAMETER_VALUE.LOCATION_OVERLAP] Input path url overlaps with managed storage within 'CheckPathAccess' call."
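
For context, attempt 3 was along these lines (a sketch only: it reuses the sas_token fetched above, assumes Hadoop's FixedSASTokenProvider for the ABFS SAS settings, and uses placeholder account/container/path names):

```python
# Wire the temporary SAS into Spark's ABFS config, then read the storage path directly.
account = "<storage-account>"  # placeholder

spark.conf.set(f"fs.azure.account.auth.type.{account}.dfs.core.windows.net", "SAS")
spark.conf.set(
    f"fs.azure.sas.token.provider.type.{account}.dfs.core.windows.net",
    "org.apache.hadoop.fs.azurebfs.sas.FixedSASTokenProvider",
)
spark.conf.set(f"fs.azure.sas.fixed.token.{account}.dfs.core.windows.net", sas_token)

# Inside Databricks this fails with INVALID_PARAMETER_VALUE.LOCATION_OVERLAP,
# because the path lies within Unity Catalog managed storage.
df = spark.read.format("delta").load(
    f"abfss://<container>@{account}.dfs.core.windows.net/<path-to-table>"
)
```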

The third one made me think. Is it that, within Databricks, access to known managed storage locations is blocked for all kinds of libraries, even when accessing with a temporary credential? This would mean temporary credentials could only be used outside of Databricks, and therefore it would not be possible to read the data in Databricks with any engine other than Spark?

And if not: has anybody gotten DuckDB to run in Databricks, directly accessing the data in the metastore?

(I know that I could always go from PySpark to pandas/Polars/Arrow/DuckDB/..., but I would be interested in skipping PySpark, especially when the amount of data is rather small.)

2 REPLIES

Walter_C
Databricks Employee

Within Databricks, access to known managed storage locations might be restricted for all libraries, even when using temporary credentials. This could explain the failures you are seeing with DuckDB, the azure-storage-blob package, and Spark.

If direct access using DuckDB is not feasible, you might consider using Spark to read the data and then converting it to a format that DuckDB can consume. This approach, although not ideal, can help you work around the current limitations.
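
For example, something like this in a notebook (a sketch; the table name is a placeholder, and spark is the session Databricks provides):

```python
import duckdb

# Read via Spark, which is allowed through Unity Catalog's access checks,
# then materialize the result as a pandas DataFrame.
pdf = spark.table("main.my_schema.my_table").toPandas()

# DuckDB can scan an in-scope pandas DataFrame by name (replacement scan).
print(duckdb.sql("SELECT count(*) FROM pdf").fetchall())
```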

matthiasn
New Contributor

As I said, I know that it would work to read the data using Spark.
That restrictions might exist is obvious; I was asking whether anybody can confirm them, or whether somebody has managed to use temporary credentials to read the data inside a notebook.
