01-21-2025 11:19 PM
Hi Community,
I’ve been working on a data pipeline using Databricks Delta Live Tables (DLT) with Unity Catalog. My setup saves data into Azure Data Lake Storage (ADLS) for streaming tables. However, I’ve noticed that Unity Catalog obscures various components, including the schema name, table name, and the data layout, so the contents appear encrypted and unreadable in the ADLS container.
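For context, the streaming tables are defined in a DLT pipeline roughly along these lines (a minimal sketch with placeholder names, paths, and formats, assuming an Auto Loader source; the real pipeline is more involved):

import dlt

# Minimal sketch of a Unity Catalog-managed DLT streaming table.
# The table name, source path, and file format are placeholders.
@dlt.table(
    name="events_raw",
    comment="Streaming table written to the catalog's managed ADLS location"
)
def events_raw():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("abfss://<container>@<account>.dfs.core.windows.net/landing/events/")
    )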
Now, I want to create a shortcut or establish access to these tables in Microsoft Fabric for further use and analysis.
Here are a few specific details:
- The tables are managed through Unity Catalog in Databricks.
- The data resides in ADLS but appears encrypted and not directly readable outside Databricks.
- I want to seamlessly access or link these tables in MS Fabric while maintaining security and compatibility.
I’d appreciate any suggestions or best practices on how to:
- Create a shortcut or link to these tables in MS Fabric.
- Configure MS Fabric to work with Unity Catalog-managed tables stored in ADLS.
- Overcome challenges related to encryption and schema compatibility.
Thanks in advance for your insights!
Looking forward to hearing your ideas and experiences.
Labels:
- Delta Sharing
- Unity Catalog
Accepted Solutions
01-28-2025 07:38 AM
Hi Community,
I previously reached out regarding creating shortcuts in Microsoft Fabric for Databricks Delta Live Tables (DLT) managed through Unity Catalog, specifically when the data resides in Azure Data Lake Storage (ADLS) and appears encrypted.
After some experimentation, I’ve found a solution that allows seamless access to these Unity Catalog-managed tables in Microsoft Fabric while ensuring compatibility and security.
Solution for Creating Shortcuts in Microsoft Fabric:
Create a Schema in Microsoft Fabric:
- Set up a schema in Fabric that matches the catalog and schema used in Databricks for consistency.
Add a New Shortcut in the Schema:
- Navigate to the newly created schema.
- Click "New Shortcut" and select ADLS Gen2 as the data source.
Set Up the Connection:
- Use an existing connection to your ADLS Gen2 storage or create a new one. Make sure the proper credentials and access permissions are configured.
Choose Path View:
- While adding the shortcut, select "Path View" instead of "Browse View".
Provide Shortcut Details:
- Specify the shortcut name.
- Add the ADLS URL and sub-path for the Delta tables.
- To automate retrieving the sub-paths, use the script below.
Feel free to reach out if you have further questions or suggestions to enhance this approach!
Code -
from pyspark.sql import Row


def get_delta_table_location(table_catalog: str, table_schema: str, table_name: str) -> str:
    """
    Get the ADLS location of a Delta table from its catalog, schema, and table name.

    Parameters:
        table_catalog (str): Catalog name of the table.
        table_schema (str): Schema name of the table.
        table_name (str): Table name.

    Returns:
        str: The location of the Delta table in ADLS.
    """
    full_table_name = f"{table_catalog}.{table_schema}.{table_name}"
    df = spark.sql(f"DESCRIBE TABLE EXTENDED {full_table_name}")
    # DESCRIBE TABLE EXTENDED reports the storage path in the 'data_type' column
    # of the row whose 'col_name' is 'Location'.
    location_row = df.filter("col_name = 'Location' AND data_type LIKE 'abfss://%'").first()
    if not location_row:
        raise Exception(f"Location not found for table: {full_table_name}")
    return location_row["data_type"]


# List of tables to process
tables = [
    {"catalog": "your_catalog", "schema": "your_schema", "table": "table_1"},
    {"catalog": "your_catalog", "schema": "your_schema", "table": "table_2"},
]

# Collect (catalog, schema, table, path) results
table_paths = []

# Process each table and fetch its location
for table in tables:
    try:
        location = get_delta_table_location(table["catalog"], table["schema"], table["table"])
        # Transform the abfss:// location into the relative sub-path expected by the Fabric shortcut
        relative_path = location.replace(
            "abfss://your_container@your_adls_account.dfs.core.windows.net/", "/your_container/"
        )
        table_paths.append((table["catalog"], table["schema"], table["table"], relative_path))
    except Exception as e:
        table_paths.append((table["catalog"], table["schema"], table["table"], f"Error: {str(e)}"))

# Convert results into a Spark DataFrame for review
result_df = spark.createDataFrame(
    [Row(Catalog=row[0], Schema=row[1], Table_Name=row[2], Path=row[3]) for row in table_paths]
)

# Display the results
result_df.display()
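If you prefer not to parse the DESCRIBE TABLE EXTENDED output, a simpler variant (a sketch against the same placeholder table names) is DESCRIBE DETAIL, which exposes the storage path directly in its location column:

# Alternative sketch: DESCRIBE DETAIL returns one row per Delta table with a
# 'location' column holding the abfss:// path, so no string filtering is needed.
detail_df = spark.sql("DESCRIBE DETAIL your_catalog.your_schema.table_1")
print(detail_df.select("location").first()["location"])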
01-22-2025 08:00 AM
This seems like a case where you'd want to use Delta Sharing rather than going against the table files directly. I've seen a few Microsoft tutorials that say to make a shortcut directly to the storage, but you lose Unity Catalog governance if you connect straight to the tables in the filesystem, and the Databricks documentation recommends against accessing table files directly for a number of reasons.
Fabric is still a maturing service, so I didn't find much on using Delta Sharing with Fabric. There is a connector for dataflows, and Power BI has a connector as well, so Microsoft does have support. You may be among the first to try this, so the best I can suggest is to let everyone know how it works.
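For anyone who wants to experiment with the Delta Sharing route outside of the built-in connectors, here's a rough sketch using the open-source delta-sharing Python client; the profile file and the share/schema/table coordinate are placeholders you'd get from the share provider:

import delta_sharing

# Rough sketch: read a shared table with the open delta-sharing client.
# 'config.share' is the credential file downloaded from the provider, and
# 'my_share.my_schema.my_table' is a placeholder share coordinate.
profile_path = "/path/to/config.share"
table_url = profile_path + "#my_share.my_schema.my_table"

df = delta_sharing.load_as_pandas(table_url)
print(df.head())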
01-22-2025 08:12 AM - edited 01-22-2025 08:12 AM
And also, if you get this to work, I'll remind you that the call for presentations for DAIS 2025 is open until Feb 3 and you should submit a talk about connecting Fabric and Databricks because I think a lot of people would want to know that.
01-22-2025 08:28 AM
@Rjdudley Thank you for your suggestion! I will follow up on it. It seems that registration is not open yet.
01-22-2025 11:15 AM
You submit talks before registration opens: Call for Presentations — Data + AI Summit 2025 | Databricks