Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

viewing managed delta table files

smpa01
Contributor

I am getting an error when I try to view the underlying files of a managed Delta table in Unity Catalog.

For example:

table_name = "workspace.db_bronze.test_01"
data = [{"x": 1, "y": 2}]
df = spark.createDataFrame(data)

# Write the DataFrame as a managed Delta table in Unity Catalog
(
    df.write.format("delta")
    .mode("overwrite")
    .saveAsTable(table_name)
)

details = spark.sql(f"DESCRIBE DETAIL {table_name}")

display(details)

# showing some columns below; location is blank

format | id         | name                        | description | location
delta  | some_value | workspace.db_bronze.test_01 | null        |

Also, the following generates an error:

import os

df = spark.table("workspace.db_bronze.test_01")
input_files = df.inputFiles()

# print("Table files:")
for file_path in input_files:
    print(f"  {file_path}")

# Extract the table directory from the first data file
if input_files:
    table_directory = os.path.dirname(input_files[0])
    # Simple directory listing - no recursion
    files = dbutils.fs.ls(table_directory)
    print(files)

# Error - Input path url 's3://url' overlaps with managed storage within 'ListFiles' call.

Do I understand correctly that paths for managed tables are not accessible? According to the docs (Databricks Managed Paths), Unity Catalog creates a new directory for a managed table in the Unity Catalog-configured storage location associated with the containing schema.

 

Accepted Solutions

mnorland
Valued Contributor

That is correct. Users cannot directly view the contents of the managed paths (the __unitystorage subdirectory and below) that hold the underlying data files of a managed table in Unity Catalog.
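
If it helps, here is a minimal sketch of how a notebook could skip the listing call for paths that sit in managed storage, rather than hitting the 'ListFiles' error. It is plain Python and does not call Spark itself. The helper names (is_managed_storage_path, listable_directories) and the example paths are hypothetical, and it assumes the managed directory is literally named __unitystorage, so treat it as an illustration rather than an official check.

```python
from urllib.parse import urlparse

# Assumption: managed storage sits under a path segment with this exact name.
MANAGED_MARKER = "__unitystorage"


def is_managed_storage_path(path: str) -> bool:
    """Return True if any segment of the path equals the managed-storage marker."""
    segments = urlparse(path).path.split("/")
    return MANAGED_MARKER in segments


def listable_directories(file_paths):
    """Return parent directories outside managed storage, deduplicated, in input order."""
    directories = []
    for file_path in file_paths:
        parent = file_path.rsplit("/", 1)[0]
        if not is_managed_storage_path(parent) and parent not in directories:
            directories.append(parent)
    return directories


# Hypothetical file paths, shaped like the output of df.inputFiles()
example_files = [
    "s3://bucket/root/__unitystorage/catalogs/abc/tables/def/part-0000.parquet",
    "s3://bucket/external/sales/part-0000.parquet",
    "s3://bucket/external/sales/part-0001.parquet",
]

# Only the external table directory remains as a candidate for listing
print(listable_directories(example_files))
```

In a notebook, the result could be passed to dbutils.fs.ls one directory at a time. For a managed table the list comes back empty, so no listing is attempted. Listing an external location still requires the usual permissions on that location.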
