10-19-2022 04:01 AM
Suppose there is a database db containing many tables, and I want to get the size of those tables. How can I get it in SQL, Python, or PySpark?
Even if I have to get them one by one, that's fine.
10-19-2022 06:01 AM
DESCRIBE DETAIL table_name returns the sizeInBytes
10-19-2022 07:58 AM
DESCRIBE DETAIL gives you only the size of the latest snapshot. It's worth running a dbutils.fs.ls on the table path as well.
10-19-2022 09:04 AM
@Raman Gupta - Please refer to the below
Calculate the size of a Delta table:
%scala
import com.databricks.sql.transaction.tahoe._
val deltaLog = DeltaLog.forTable(spark, "dbfs:/delta-table-path")
val snapshot = deltaLog.snapshot // the current Delta table snapshot
println(s"Total file size (bytes): ${snapshot.sizeInBytes}")
Calculate the size of a non-Delta table:
%scala
spark.read.table("non-delta-table-name").queryExecution.analyzed.stats
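For the non-Delta case in Python, one hedged sketch: compute table statistics first, then parse the Statistics row out of DESCRIBE TABLE EXTENDED. The table name is a placeholder, and the exact "<n> bytes, <m> rows" format of the Statistics string is an assumption here:

```python
def non_delta_table_size_bytes(spark, table_name):
    """Approximate on-disk size in bytes of a non-Delta table, from table statistics."""
    # Populate table-level statistics first.
    spark.sql(f"ANALYZE TABLE {table_name} COMPUTE STATISTICS")
    # DESCRIBE TABLE EXTENDED returns (col_name, data_type, comment) rows;
    # the Statistics row's data_type is assumed to look like "4096 bytes, 100 rows".
    for row in spark.sql(f"DESCRIBE TABLE EXTENDED {table_name}").collect():
        if row["col_name"] == "Statistics":
            return int(row["data_type"].split()[0])
    return None  # statistics not found
```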
10-19-2022 10:36 AM
I already know the Scala approach. I wanted this in either Python or SQL.
Scala link: https://kb.databricks.com/sql/find-size-of-table.html#:~:text=To%20find%20the%20size%20of%20a%20delt...
10-19-2022 10:54 AM
@Raman Gupta - could you please try the below
%python
spark.sql("DESCRIBE DETAIL `delta-table-name`").select("sizeInBytes").collect()
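Since collect() returns a list of Row objects rather than the number itself, a small sketch to pull out the scalar, with `my_table` as a hypothetical table name:

```python
def delta_table_size_bytes(spark, table_name):
    """Size in bytes of the current snapshot of a Delta table."""
    # DESCRIBE DETAIL returns a single row; sizeInBytes covers only
    # the files referenced by the latest table version.
    rows = spark.sql(f"DESCRIBE DETAIL {table_name}").select("sizeInBytes").collect()
    return rows[0]["sizeInBytes"]

# Usage (in a Spark session):
# print(delta_table_size_bytes(spark, "my_table"))
```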
10-19-2022 10:26 PM
Thanks @Shanmugavel Chandrakasu