Get total size of data in a catalog and schema in Unity Catalog
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
02-01-2023 08:05 AM
For a KPI dashboard, we need to know the exact size of the data in a catalog and also all schemas inside the catalogs. What is the best way to do this?
We tried to iterate over all tables and sum the sizeInBytes using the DESCRIBE DETAIL command for the tables. However, since we have a lot of tables, it takes a really long time.
We also tried looking in the information_schema databases for all the catalogs but couldn't find such information there.
- Labels:
-
DESCRIBE DETAIL
-
Unity Catalog
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
03-08-2023 05:56 PM
@Anant Pingle : Please try using Databricks' Metadata API. This API provides programmatic access to metadata about Databricks objects such as tables, views, and databases.
from pyspark.sql.functions import sum
# Replace "my_catalog" with the name of your catalog
catalog_name = "my_catalog"
# Get a list of all tables in the catalog
tables = spark.catalog.listTables(catalog_name)
# Compute the size of each table and sum them up
total_size = sum([spark.table(table.database + "." + table.name).count() for table in tables])
print(f"The total size of {catalog_name} is {total_size} rows.")
Link to the API documentation: https://docs.databricks.com/dev-tools/api/latest/workspace.html
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
04-08-2023 07:09 PM
Hi @Anant Pingle
Thank you for posting your question in our community! We are happy to assist you.
To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your question?
This will also help other community members who may have similar questions in the future. Thank you for your participation and let us know if you need any further assistance!

