Connect to OneLake using Service Principal, Unity Catalog and Databricks Access Connector
3 weeks ago - last edited 3 weeks ago
We are trying to connect Databricks to OneLake, to read data from a Fabric workspace into Databricks, using a notebook. We also use Unity Catalog. We are able to read data from the workspace with a Service Principal like this:
from pyspark.sql.types import *
from pyspark.sql.functions import *
# Credentials (placeholders; replace with your own values)
client_id = "xxx"
tenant_id = "xxx"
client_secret = "xxx"
spark.conf.set("fs.azure.account.auth.type", "OAuth")
spark.conf.set("fs.azure.account.oauth.provider.type", "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set("fs.azure.account.oauth2.client.id", client_id)
spark.conf.set("fs.azure.account.oauth2.client.secret", client_secret)
spark.conf.set("fs.azure.account.oauth2.client.endpoint", f"https://login.microsoftonline.com/{tenant_id}/oauth2/token")
# Define the OneLake parameters
lakehouse_name = "testlakehouse01"
workspace_name = "fabrictest"
# No trailing slash here: the table name is appended with "/" below
fullpathtotablesinworkspace = f"abfss://{workspace_name}@onelake.dfs.fabric.microsoft.com/{lakehouse_name}.Lakehouse/Tables"
tablename = "publicholidays"
publicholidaysdf = spark.read.format("delta").load(f"{fullpathtotablesinworkspace}/{tablename}")
display(publicholidaysdf.limit(10))
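A minimal sketch of building the table path in one place, so that stray or doubled slashes cannot creep in. The helper name `onelake_table_path` is hypothetical (it is not part of any Databricks or Fabric API); the host and the `.Lakehouse/Tables` layout are taken from the notebook code above.

```python
def onelake_table_path(workspace, lakehouse, table):
    """Build the abfss path to a lakehouse table (hypothetical helper)."""
    # Host and folder layout copied from the notebook snippet above
    base = f"abfss://{workspace}@onelake.dfs.fabric.microsoft.com"
    segments = [f"{lakehouse}.Lakehouse", "Tables", table]
    # Strip slashes from each segment so the join never produces "//"
    return base + "/" + "/".join(s.strip("/") for s in segments)


path = onelake_table_path("fabrictest", "testlakehouse01", "publicholidays")
print(path)
```

In the notebook, the result could then be passed to `spark.read.format("delta").load(path)`.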
According to this documentation: https://learn.microsoft.com/en-us/azure/databricks/connect/unity-catalog/#path-based-access-to-cloud..., it looks like we need (or are at least advised) to use an external location instead of the raw URI, because we use Unity Catalog. Is that right?
We tried to 'mount' the OneLake tables in Databricks using the access connector we already have (storage based), but we get errors.
Using the GUI:
Using a cluster:
PERMISSION_DENIED: The contributor role on the storage account is not set or Managed Identity does not have READ permissions on url abfss://fabrictest@onelake.dfs.core.windows.net/testlakehouse01.Lakehouse/Tables. Please contact your account admin to update the storage credential. PERMISSION_DENIED: Failed to authenticate with the configured service principal. Please contact your account admin to update the configuration. exceptionTraceId=a5e324b9-3bb7-4663-b1cb-8143f30cf830 SQLSTATE: 42501
Is the URI correct?
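To make the URI question concrete, here is a small sketch that compares the URI used in the notebook with the one quoted in the error message. It uses only the standard library; the function names `describe_abfss` and `diff_uris` are made up for illustration. It only shows where the two URIs differ and does not by itself say which form an external location expects.

```python
from urllib.parse import urlsplit


def describe_abfss(uri):
    """Split an abfss URI into container (workspace), host, and path."""
    parts = urlsplit(uri)
    return {
        "container": parts.username,     # the part before "@"
        "host": parts.hostname,          # the part after "@"
        "path": parts.path.rstrip("/"),  # ignore a trailing slash
    }


def diff_uris(a, b):
    """Return only the components that differ between two abfss URIs."""
    da, db = describe_abfss(a), describe_abfss(b)
    return {k: (da[k], db[k]) for k in da if da[k] != db[k]}


# URI as built in the notebook code
notebook_uri = "abfss://fabrictest@onelake.dfs.fabric.microsoft.com/testlakehouse01.Lakehouse/Tables/"
# URI as quoted in the PERMISSION_DENIED message
error_uri = "abfss://fabrictest@onelake.dfs.core.windows.net/testlakehouse01.Lakehouse/Tables"

print(diff_uris(notebook_uri, error_uri))
```

The only differing component is the host: `onelake.dfs.fabric.microsoft.com` in the notebook versus `onelake.dfs.core.windows.net` in the error message.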
The error message on a cluster implies we have to grant permissions on the OneLake storage, but how? And where exactly?
Thanx,
Judith

