
Connect to Onelake using Service Principal, Unity Catalog and Databricks Access Connector

Judith
New Contributor III

We are trying to connect Databricks to OneLake to read data from a Fabric workspace into Databricks using a notebook. We also use Unity Catalog. We are able to read data from the workspace with a Service Principal like this:

from pyspark.sql.types import *
from pyspark.sql.functions import *

# Credentials (redacted)
client_id = "xxx"
tenant_id = "xxx"
client_secret = "xxx"

# OAuth configuration for the service principal
spark.conf.set("fs.azure.account.auth.type", "OAuth")
spark.conf.set("fs.azure.account.oauth.provider.type", "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set("fs.azure.account.oauth2.client.id", client_id)
spark.conf.set("fs.azure.account.oauth2.client.secret", client_secret)
spark.conf.set("fs.azure.account.oauth2.client.endpoint", f"https://login.microsoftonline.com/{tenant_id}/oauth2/token")

# Define the OneLake parameters
lakehouse_name = "testlakehouse01"
workspace_name = "fabrictest"

fullpathtotablesinworkspace = f"abfss://{workspace_name}@onelake.dfs.fabric.microsoft.com/{lakehouse_name}.Lakehouse/Tables"
tablename = "publicholidays"
publicholidaysdf = spark.read.format("delta").load(f"{fullpathtotablesinworkspace}/{tablename}")
display(publicholidaysdf.limit(10))

As per this documentation: https://learn.microsoft.com/en-us/azure/databricks/connect/unity-catalog/#path-based-access-to-cloud..., we need (or want?) to use an external location instead of the raw URI, because we use Unity Catalog, right?
We tried to 'mount' the OneLake tables in Databricks using the access connector we already have (storage based), but we get errors.

Using the GUI:

[screenshots of the errors attached]

Using a cluster:
PERMISSION_DENIED: The contributor role on the storage account is not set or Managed Identity does not have READ permissions on url abfss://fabrictest@onelake.dfs.core.windows.net/testlakehouse01.Lakehouse/Tables. Please contact your account admin to update the storage credential. PERMISSION_DENIED: Failed to authenticate with the configured service principal. Please contact your account admin to update the configuration. exceptionTraceId=a5e324b9-3bb7-4663-b1cb-8143f30cf830 SQLSTATE: 42501

Is the URI correct?
The error message on a cluster implies we have to grant permissions on the OneLake storage, but how? And where exactly?

Thanks,

Judith

 

5 REPLIES

behema1074
New Contributor II

Hi, I am facing the same problem. Have you been able to solve it yet?

adriennn
Valued Contributor

UC now displays a new error when trying to add an external location pointing to a OneLake abfss path. The error says that OneLake URLs are not supported as external locations.

nayan_wylde
Esteemed Contributor

One thing you can check is whether the Databricks Access Connector has Storage Blob Data Contributor access on the data lake.

mark_ott
Databricks Employee

To connect Databricks to OneLake using Unity Catalog and access data with a service principal, and to address the "PERMISSION_DENIED" error you encountered, here are the key points and steps:

Use External Location with Unity Catalog

  • When using Unity Catalog, you typically do not access cloud storage directly by URI. Instead, you create an external location in Unity Catalog that references your OneLake storage path (see the sketch after this list).

  • External locations allow controlled, managed access to storage with permissions enforced via Unity Catalog rather than raw storage permissions.

  • Creating and managing external locations requires appropriate privileges in the metastore (e.g., metastore admin or external location owner role).
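For illustration, here is a minimal sketch of registering the OneLake path as an external location from a notebook, reusing the workspace and lakehouse names from the question. The credential and location names (onelake_cred, onelake_fabrictest) are placeholders, and it assumes a storage credential already exists. Note that, as noted in an earlier reply, Unity Catalog may currently reject OneLake URLs for external locations, so this may fail depending on platform support.

# Hypothetical sketch: register the OneLake path as a Unity Catalog external location.
# "onelake_cred" and "onelake_fabrictest" are placeholder names; run this with a principal
# that is allowed to create external locations (e.g., a metastore admin).
spark.sql("""
    CREATE EXTERNAL LOCATION IF NOT EXISTS onelake_fabrictest
    URL 'abfss://fabrictest@onelake.dfs.fabric.microsoft.com/testlakehouse01.Lakehouse/Tables'
    WITH (STORAGE CREDENTIAL onelake_cred)
    COMMENT 'Fabric workspace fabrictest, lakehouse testlakehouse01'
""")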

Permissions for OneLake Storage Access

  • The "PERMISSION_DENIED" error indicates that the service principal does not have sufficient permissions on the OneLake storage.

  • You need to grant your Databricks service principal both Azure RBAC roles and OneLake workspace access:

    • At the Azure level, assign your service principal roles like Storage Blob Data Contributor or Storage Account Contributor on the OneLake storage account or relevant resource group.

    • Within OneLake (Microsoft Fabric workspace), assign the contributor role or equivalent access for the service principal to the target Fabric workspace or lakehouse.

  • The access control model for OneLake uses deny-by-default, so explicit granting in both the Azure portal (IAM role assignments) and Fabric workspace access control is required.
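Before assigning roles, it can help to rule out a credential problem. This is a minimal sketch, assuming the azure-identity package is available (install with %pip install azure-identity if needed): if the token request below fails, the issue is the service principal secret or tenant rather than OneLake permissions.

# Sanity check (sketch): can the service principal authenticate at all?
from azure.identity import ClientSecretCredential

credential = ClientSecretCredential(
    tenant_id=tenant_id,        # same values as in the question's notebook
    client_id=client_id,
    client_secret=client_secret,
)

# Request a token for the Azure Storage resource; success means the SP credentials are valid.
token = credential.get_token("https://storage.azure.com/.default")
print("Token acquired, expires at:", token.expires_on)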

Steps to Grant Permissions

  1. In the Azure portal, go to your OneLake storage account or resource group.

  2. Open "Access Control (IAM)" and add a role assignment for your service principal with the role Storage Blob Data Contributor or Storage Account Contributor.

  3. In the Fabric portal, navigate to the target workspace, open Manage Access, and add your service principal with at least the Contributor role so it can access lakehouse data.

  4. Confirm that the service principal has the necessary permissions to authenticate and read from the storage URI (see the verification sketch after this list).
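As a quick verification, here is a sketch that reuses the OAuth spark.conf settings and path from the original notebook; once the Fabric workspace role and Azure RBAC assignments have propagated (which can take a few minutes), the read should succeed.

# Verification sketch: retry the original read after permissions have been granted.
tables_path = "abfss://fabrictest@onelake.dfs.fabric.microsoft.com/testlakehouse01.Lakehouse/Tables"

try:
    df = spark.read.format("delta").load(f"{tables_path}/publicholidays")
    print("Access OK, rows:", df.count())
except Exception as e:
    print("Still denied - re-check workspace role and RBAC assignments:", e)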

Using External Location in Databricks

  • Create an external location in Databricks referencing your OneLake path using the same service principal/credential.

  • Assign this external location to the appropriate workspace(s).

  • Use Unity Catalog tables via this external location for fine-grained access control rather than mounting the storage manually (see the sketch below).
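A sketch of what governed access can look like once the external location exists, using the hypothetical names from the earlier sketch and a placeholder group; with Unity Catalog mediating access, the manual fs.azure.* OAuth settings are no longer needed in the notebook.

# Grant read access on the external location to a (placeholder) group, then read by path.
spark.sql("GRANT READ FILES ON EXTERNAL LOCATION onelake_fabrictest TO `data_engineers`")

path = "abfss://fabrictest@onelake.dfs.fabric.microsoft.com/testlakehouse01.Lakehouse/Tables/publicholidays"
df = spark.read.format("delta").load(path)   # authorization now flows through Unity Catalog
display(df.limit(10))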

About Mounting OneLake Storage

  • Mounting OneLake storage via DBFS is generally not recommended when using Unity Catalog.

  • Instead, use external locations tied to Unity Catalog and the service principal access model for secured, governed data access.

  • Mount attempts often fail with permission errors due to missing Contributor roles or managed identity rights on the OneLake storage account.


This guidance should help resolve permission issues and align with best practices using Unity Catalog external locations for OneLake data in Databricks.

Coffee77
Contributor III

As commented, you need to assign the Storage Blob Data Contributor or Storage Account Contributor role to the service principal you're using in the connection provided to the external location.

Another, more advanced and even better option is to use the managed identity associated with an Azure Access Connector for Databricks, so that you can avoid using secrets or passwords. That managed identity should be granted the same roles.
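A minimal sketch of that managed-identity route, with hypothetical credential/location names and placeholder resource IDs: the storage credential wraps the Access Connector's managed identity, and the external location then points the OneLake path at that credential.

# Sketch: storage credential backed by the Access Connector's managed identity (placeholders).
spark.sql("""
    CREATE STORAGE CREDENTIAL onelake_mi_cred
    WITH (
      AZURE_MANAGED_IDENTITY (
        ACCESS_CONNECTOR_ID = '/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.Databricks/accessConnectors/<connector-name>'
      )
    )
""")

# External location on the OneLake path, using that credential.
spark.sql("""
    CREATE EXTERNAL LOCATION IF NOT EXISTS onelake_fabrictest_mi
    URL 'abfss://fabrictest@onelake.dfs.fabric.microsoft.com/testlakehouse01.Lakehouse/Tables'
    WITH (STORAGE CREDENTIAL onelake_mi_cred)
""")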

I explain that in this video, but it's only in Spanish so far. Maybe I'll make it in English soon 🙂 https://youtu.be/HSSWP5UbkNY?si=DdzKx-KGJJQUXb3k


Lifelong Learner Cloud & Data Solution Architect | https://www.youtube.com/@CafeConData