How do we manage data recency in Databricks?

User16826994223
Databricks Employee

I want to know how Databricks maintains data recency.

sajith_appukutt
Databricks Employee

When you use Delta tables in Databricks, you get the benefit of the Delta cache, which accelerates data reads by creating copies of remote files in the nodes' local storage using a fast intermediate data format. In addition, at the beginning of each query, Delta tables automatically update to the latest version of the table, so queries always see recent data.

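However, if it is acceptable for results to be stale for a short time, you can lower query latency further by setting the Spark session configuration spark.databricks.delta.stalenessLimit to a time string value, e.g. 1h, 15m, or 1d. With this setting, a query can use the table state already loaded in the session instead of waiting for the update to the latest version, as long as that state is no older than the limit.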
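
As a rough illustration, here is a minimal Python sketch of setting that configuration on a session. The helper name set_staleness_limit and the format check are my own assumptions for the example, not part of any Databricks API. The check only accepts an integer followed by s, m, h, or d, which is narrower than what Spark itself may accept. The only real calls used are spark.conf.set and spark.conf.get.

```python
import re

# Configuration key named in the answer above.
STALENESS_KEY = "spark.databricks.delta.stalenessLimit"

# Assumption for this sketch: an integer followed by one unit (s, m, h, d),
# which covers the examples 1h, 15m, and 1d. Spark may accept more forms.
_TIME_STRING = re.compile(r"^\d+(s|m|h|d)$")


def set_staleness_limit(spark, limit):
    """Hypothetical helper: set the Delta staleness limit for this session.

    `spark` is expected to be a SparkSession (or any object exposing
    `conf.set` and `conf.get`). Returns the value read back from the session.
    """
    if not _TIME_STRING.match(limit):
        raise ValueError(f"expected a time string such as '1h' or '15m', got {limit!r}")
    spark.conf.set(STALENESS_KEY, limit)
    return spark.conf.get(STALENESS_KEY)
```

In a Databricks notebook, where spark is already defined, calling set_staleness_limit(spark, "15m") would apply the setting to the current session only. Other sessions keep their own value.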