<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Measure size of all tables in Azure databricks in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/measure-size-of-all-tables-in-azure-databricks/m-p/69975#M33949</link>
    <description>&lt;P&gt;Hi Team,&lt;/P&gt;&lt;P&gt;I am trying to find the size of all tables in my Azure Databricks environment. I want to understand current data-loading trends so that I can forecast data growth (e.g., if approximately 100 GB of data came in over the last 2 months, then about 150 GB should come in over the next 2-3 months).&lt;/P&gt;&lt;P&gt;My production Azure Databricks environment uses Unity Catalog, which hosts:&lt;/P&gt;&lt;P&gt;a- All Bronze tables&lt;/P&gt;&lt;P&gt;b- All Silver tables&lt;/P&gt;&lt;P&gt;c- All Gold tables&lt;/P&gt;&lt;P&gt;d- Some extra Delta Live Tables, acting as temp tables that hold the results of intermediate calculations&lt;/P&gt;&lt;P&gt;e- Some tables created from Excel sheet data&lt;/P&gt;&lt;P&gt;The above tables are Delta Live Tables, created via DLT-based pipelines/jobs.&lt;/P&gt;&lt;P&gt;So I am looking for a script/code/solution that gives me the total size in GB of all tables in a given database.&lt;/P&gt;&lt;P&gt;A SQL-based solution would be best, but an answer based on Python/Scala would also be fine.&lt;/P&gt;&lt;P&gt;Also, in the traditional relational DBMS world, there used to be several built-in reports that show data-loading trends via charts or graphs. Do we have such a built-in feature in Azure Databricks?&lt;/P&gt;&lt;P&gt;Thanks in advance,&lt;/P&gt;&lt;P&gt;Devsql&lt;/P&gt;</description>
    <pubDate>Mon, 20 May 2024 12:27:55 GMT</pubDate>
    <dc:creator>Devsql</dc:creator>
    <dc:date>2024-05-20T12:27:55Z</dc:date>
    <item>
      <title>Measure size of all tables in Azure databricks</title>
      <link>https://community.databricks.com/t5/data-engineering/measure-size-of-all-tables-in-azure-databricks/m-p/69975#M33949</link>
      <description>&lt;P&gt;Hi Team,&lt;/P&gt;&lt;P&gt;I am trying to find the size of all tables in my Azure Databricks environment. I want to understand current data-loading trends so that I can forecast data growth (e.g., if approximately 100 GB of data came in over the last 2 months, then about 150 GB should come in over the next 2-3 months).&lt;/P&gt;&lt;P&gt;My production Azure Databricks environment uses Unity Catalog, which hosts:&lt;/P&gt;&lt;P&gt;a- All Bronze tables&lt;/P&gt;&lt;P&gt;b- All Silver tables&lt;/P&gt;&lt;P&gt;c- All Gold tables&lt;/P&gt;&lt;P&gt;d- Some extra Delta Live Tables, acting as temp tables that hold the results of intermediate calculations&lt;/P&gt;&lt;P&gt;e- Some tables created from Excel sheet data&lt;/P&gt;&lt;P&gt;The above tables are Delta Live Tables, created via DLT-based pipelines/jobs.&lt;/P&gt;&lt;P&gt;So I am looking for a script/code/solution that gives me the total size in GB of all tables in a given database.&lt;/P&gt;&lt;P&gt;A SQL-based solution would be best, but an answer based on Python/Scala would also be fine.&lt;/P&gt;&lt;P&gt;Also, in the traditional relational DBMS world, there used to be several built-in reports that show data-loading trends via charts or graphs. Do we have such a built-in feature in Azure Databricks?&lt;/P&gt;&lt;P&gt;Thanks in advance,&lt;/P&gt;&lt;P&gt;Devsql&lt;/P&gt;</description>
      <pubDate>Mon, 20 May 2024 12:27:55 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/measure-size-of-all-tables-in-azure-databricks/m-p/69975#M33949</guid>
      <dc:creator>Devsql</dc:creator>
      <dc:date>2024-05-20T12:27:55Z</dc:date>
    </item>
    <item>
      <title>Re: Measure size of all tables in Azure databricks</title>
      <link>https://community.databricks.com/t5/data-engineering/measure-size-of-all-tables-in-azure-databricks/m-p/75312#M34925</link>
      <description>&lt;P&gt;Hi &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/9"&gt;@Retired_mod&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;1- Regarding this issue, I found the link below:&lt;/P&gt;&lt;P&gt;&lt;A href="https://kb.databricks.com/sql/find-size-of-table#:~:text=You%20can%20determine%20the%20size,stats%20to%20return%20the%20size" target="_blank"&gt;https://kb.databricks.com/sql/find-size-of-table#:~:text=You%20can%20determine%20the%20size,stats%20to%20return%20the%20size&lt;/A&gt;&lt;/P&gt;&lt;P&gt;To try the approach in that link, I need to choose between the Delta table and non-Delta table methods.&lt;/P&gt;&lt;P&gt;A few of my tables are materialized views, while others are streaming tables, so I cannot tell which method applies: Delta table or non-Delta table.&lt;/P&gt;&lt;P&gt;Can you please help me with that?&lt;/P&gt;&lt;P&gt;2- Also, in the above link, the Delta table method uses the line of code below:&lt;/P&gt;&lt;PRE&gt;DeltaLog.forTable(spark, "dbfs:/&amp;lt;path-to-delta-table&amp;gt;")&lt;/PRE&gt;&lt;P&gt;I do not know the path "dbfs:/&amp;lt;path-to-delta-table&amp;gt;".&lt;/P&gt;&lt;P&gt;Where can I get this path: from the UI or from a command?&lt;/P&gt;&lt;P&gt;As I work for a corporate company, the admin has given me only the minimum necessary access. If you guide me, I can ask the admin for the required permission.&lt;/P&gt;&lt;P&gt;3- When I opened the link you gave above and tried your solution, I got the error below:&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Cannot query a Streaming Table `DB`.`Schema`.`Table_append_raw` from an Assigned or No isolation shared cluster, please use a SHARED cluster or a Databricks SQL warehouse instead.&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;From this I understand that we need to use a &lt;SPAN&gt;SHARED cluster&lt;/SPAN&gt;. Is there any other option?&lt;/P&gt;&lt;P&gt;Thank you,&lt;/P&gt;&lt;P&gt;Devsql&lt;/P&gt;</description>
      <pubDate>Fri, 21 Jun 2024 09:47:58 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/measure-size-of-all-tables-in-azure-databricks/m-p/75312#M34925</guid>
      <dc:creator>Devsql</dc:creator>
      <dc:date>2024-06-21T09:47:58Z</dc:date>
    </item>
  </channel>
</rss>

