10-19-2022 04:01 AM
Suppose there is a database db containing many tables. How can I get the size of those tables in SQL, Python, or PySpark?
Even getting them one by one is fine.
10-19-2022 06:01 AM
DESCRIBE DETAIL table_name returns the table's sizeInBytes.
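For example, in a SQL cell (table_name is a placeholder):
%sql
DESCRIBE DETAIL table_name;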
10-19-2022 07:58 AM
DESCRIBE DETAIL gives you only the size of the latest snapshot. It's worth running dbutils.fs.ls on the table's storage path to see the total size on disk, including files retained for earlier versions.
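For example, a minimal Python sketch (assuming the table's files live under dbfs:/delta-table-path, a placeholder; entries that dbutils.fs.ls reports as directories have paths ending in a slash):
%python
# Sketch: sum the size of every file under a path, including files kept
# for older Delta versions and the _delta_log directory.
def dir_size_bytes(path):
    total = 0
    for f in dbutils.fs.ls(path):
        if f.path.endswith("/"):  # directories are listed with a trailing slash
            total += dir_size_bytes(f.path)
        else:
            total += f.size
    return total

print(dir_size_bytes("dbfs:/delta-table-path"))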
10-19-2022 09:04 AM
@Raman Gupta - Please refer to the below
Calculate the size of the Delta table:
%scala
import com.databricks.sql.transaction.tahoe._

// Load the Delta log for the table and take its current snapshot
val deltaLog = DeltaLog.forTable(spark, "dbfs:/delta-table-path")
val snapshot = deltaLog.snapshot // the current Delta table snapshot
println(s"Total file size (bytes): ${snapshot.sizeInBytes}")
Calculate the size of a non-Delta table:
%scala
spark.read.table("non-delta-table-name").queryExecution.analyzed.stats
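If you need the same thing from Python, one common workaround is to go through the DataFrame's internal _jdf handle (a sketch; _jdf is an internal bridge to the JVM object, not a stable public API, and may change between Spark versions):
%python
# Sketch: mirror the Scala queryExecution trick via the internal _jdf bridge.
# "non-delta-table-name" is a placeholder table name.
df = spark.read.table("non-delta-table-name")
size_bytes = df._jdf.queryExecution().analyzed().stats().sizeInBytes()
print(size_bytes)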
10-19-2022 10:36 AM
I already know how to do this in Scala.
I wanted it in either Python or SQL.
Scala link: https://kb.databricks.com/sql/find-size-of-table.html#:~:text=To%20find%20the%20size%20of%20a%20delt...
10-19-2022 10:54 AM
@Raman Gupta - could you please try the below
%python
spark.sql("describe detail delta-table-name").select("sizeInBytes").collect()
10-19-2022 10:26 PM
Thanks @Shanmugavel Chandrakasu