Hi @Mani2105,
If I create a table in the Sales catalog without specifying any external location, will the created table be managed, and will its data go to the Sales storage account?
👉 Yes, if you create a table in the Sales catalog without specifying an external location, it will be created as a managed table and its data will be stored in the default storage location configured for the Sales catalog.
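For example, a minimal sketch in PySpark (the demo schema and orders table are hypothetical names; sales is your catalog):

# Hypothetical schema/table; no LOCATION clause, so the table is managed
spark.sql("CREATE SCHEMA IF NOT EXISTS sales.demo")
spark.sql("""
    CREATE TABLE IF NOT EXISTS sales.demo.orders (
        order_id BIGINT,
        amount   DOUBLE
    )
""")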
And in the event of deleting the table, will it delete the associated files as well, and auto-optimize?
👉 Yes. For managed tables within Unity Catalog in Databricks, deleting (dropping) the table also deletes the associated data files stored in the catalog’s storage location.
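If you want to confirm a table is managed before dropping it, a quick check (reusing the hypothetical sales.demo.orders table from above):

# The "Type" row in the output shows MANAGED for managed tables
spark.sql("DESCRIBE TABLE EXTENDED sales.demo.orders").show(truncate=False)
# Dropping a managed table also removes its underlying data files
spark.sql("DROP TABLE sales.demo.orders")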
Delta Lake’s auto-optimization features (autoCompact and optimizeWrite) apply to managed tables. If you enable these settings, Databricks continuously optimizes the storage layout by compacting small files during writes, which improves query performance and storage efficiency. (Z-ordering, by contrast, is not automatic; you apply it manually with the OPTIMIZE ... ZORDER BY command.)
You can enable auto-optimization for a Spark session (or cluster) or for individual tables using configuration settings:
# Enable auto-optimization for the current Spark session
spark.conf.set("spark.databricks.delta.autoCompact.enabled", "true")    # compact small files after writes
spark.conf.set("spark.databricks.delta.optimizeWrite.enabled", "true")  # write better-sized files during writes
Will my metastore USDATA have information about the Sales catalog? If the Sales catalog has a separate storage location, does the metadata about the catalog still go to the metastore?
👉 The USDATA metastore stores metadata for all catalogs in your workspace, including the Sales catalog. Although Sales has a separate storage location, only metadata about Sales (such as its tables, schemas, and storage path) is stored in the USDATA metastore—the actual data files reside in the storage location designated for Sales.
The metastore (USDATA) holds metadata about all catalogs, schemas, and tables within your Databricks workspace, including:
- Information about each catalog (e.g., Sales).
- Schemas (databases) within each catalog.
- Tables and views, including details such as columns, data types, and table properties.
- Access control configurations, permissions, and security settings.
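For example, you can query this metadata directly through SQL (the demo schema is a hypothetical name; sales is your catalog):

# Inspect the catalog metadata the metastore holds about Sales
spark.sql("DESCRIBE CATALOG EXTENDED sales").show(truncate=False)
# List schemas and tables registered under the Sales catalog
spark.sql("SHOW SCHEMAS IN sales").show()
spark.sql("SHOW TABLES IN sales.demo").show()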
Let me know if you’d like more details on any part of this process!
Regards!
Alfonso Gallardo
-------------------
I love working with tools like Databricks, Python, Azure, Microsoft Fabric, Azure Data Factory, and other Microsoft solutions, focusing on developing scalable and efficient solutions with Apache Spark.