Tracking Delta Table Versions & Operations in Medallion Pipelines

ekhosravi
New Contributor II

In production-grade Medallion Architectures (Bronze → Silver → Gold), it’s critical to track how and when each Delta table changes.
To make this easier, I built a simple Databricks Python audit script that lists all tables in a schema and extracts their latest version, timestamp, and operation type (INSERT, UPDATE, MERGE, DELETE).

Code snippet:

version_info = []
for t in tables:
    full_table = f"{schema}.{t}"
    # DESCRIBE HISTORY returns commits newest-first; LIMIT 1 keeps only the latest entry
    history_df = spark.sql(f"DESCRIBE HISTORY {full_table} LIMIT 1")
    latest = history_df.select("version", "timestamp", "operation").collect()[0]
    version_info.append((full_table, latest["version"], latest["timestamp"], latest["operation"]))
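
For completeness, here is one way schema and tables can be populated and the results reviewed; this is only a sketch, and the schema name "silver" below is a placeholder for whichever layer you are auditing.

# Assumed setup: replace "silver" with the schema of the layer you want to audit.
schema = "silver"

# SHOW TABLES returns one row per table; the name is in the tableName column.
tables = [row["tableName"] for row in spark.sql(f"SHOW TABLES IN {schema}").collect()]

# After the loop above has filled version_info, turn it into a DataFrame for review.
audit_df = spark.createDataFrame(version_info, ["table", "version", "timestamp", "operation"])
audit_df.show(truncate=False)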

Why use this:

1. Quickly identify which Bronze, Silver, or Gold tables were updated during the latest batch.
2. Detect version mismatches between upstream and downstream layers (see the sketch after this list).
3. Gain visibility into operation types, which is useful for debugging incremental loads.
4. Maintain auditability and data lineage across the ETL pipeline (a persistence sketch follows the closing note below).
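
As a rough illustration of points 1 and 2, you can flag a downstream table whose latest commit is older than its upstream source; the table names bronze.orders and silver.orders below are hypothetical.

# Helper: pull the newest version and timestamp for a single Delta table.
def latest_commit(table_name):
    # DESCRIBE HISTORY lists commits newest-first, so LIMIT 1 gives the latest one.
    row = spark.sql(f"DESCRIBE HISTORY {table_name} LIMIT 1").collect()[0]
    return row["version"], row["timestamp"]

# Hypothetical upstream/downstream pair in a Medallion pipeline.
bronze_version, bronze_ts = latest_commit("bronze.orders")
silver_version, silver_ts = latest_commit("silver.orders")

# If Silver was last written before Bronze, the incremental load is probably lagging.
if silver_ts < bronze_ts:
    print(f"silver.orders (v{silver_version}) is behind bronze.orders (v{bronze_version})")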

This lightweight approach is a practical way to keep your Delta tables transparent, synchronized, and production-ready — no extra cost, just smart use of DESCRIBE HISTORY.
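
To cover point 4, one option is to append each audit run to a dedicated Delta table so the history accumulates over time. This is a minimal sketch that assumes the audit_df built earlier; the target name ops.delta_audit_log is an assumption, so point it at your own schema.

from pyspark.sql import functions as F

# Stamp the snapshot with the time the audit ran, then append it to an audit table.
# The target name ops.delta_audit_log is an assumption; replace it with your own.
(audit_df
    .withColumn("audit_run_ts", F.current_timestamp())
    .write
    .mode("append")
    .saveAsTable("ops.delta_audit_log"))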


