<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How do we manage data recency in Databricks in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/how-do-we-manage-data-recency-in-databricks/m-p/22061#M15079</link>
    <description>&lt;P&gt;When using Delta tables in Databricks, you get the advantage of the Delta cache, which accelerates data reads by keeping copies of remote files in the nodes’ local storage in a fast intermediate data format. At the beginning of each query, Delta tables auto-update to the latest version, so the data you read is always recent.&lt;/P&gt;&lt;P&gt;However, if it is acceptable for results to be stale for a short time, you can lower query latency further by setting the Spark session configuration variable spark.databricks.delta.stalenessLimit to a time-string value, e.g. 1h, 15m, or 1d.&lt;/P&gt;</description>
    <pubDate>Wed, 23 Jun 2021 00:43:42 GMT</pubDate>
    <dc:creator>sajith_appukutt</dc:creator>
    <dc:date>2021-06-23T00:43:42Z</dc:date>
    <item>
      <title>How do we manage data recency in Databricks</title>
      <link>https://community.databricks.com/t5/data-engineering/how-do-we-manage-data-recency-in-databricks/m-p/22060#M15078</link>
      <description>&lt;P&gt;I want to know how Databricks maintains data recency.&lt;/P&gt;</description>
      <pubDate>Mon, 21 Jun 2021 12:57:04 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-do-we-manage-data-recency-in-databricks/m-p/22060#M15078</guid>
      <dc:creator>User16826994223</dc:creator>
      <dc:date>2021-06-21T12:57:04Z</dc:date>
    </item>
    <item>
      <title>Re: How do we manage data recency in Databricks</title>
      <link>https://community.databricks.com/t5/data-engineering/how-do-we-manage-data-recency-in-databricks/m-p/22061#M15079</link>
      <description>&lt;P&gt;When using Delta tables in Databricks, you get the advantage of the Delta cache, which accelerates data reads by keeping copies of remote files in the nodes’ local storage in a fast intermediate data format. At the beginning of each query, Delta tables auto-update to the latest version, so the data you read is always recent.&lt;/P&gt;&lt;P&gt;However, if it is acceptable for results to be stale for a short time, you can lower query latency further by setting the Spark session configuration variable spark.databricks.delta.stalenessLimit to a time-string value, e.g. 1h, 15m, or 1d.&lt;/P&gt;</description>
      <pubDate>Wed, 23 Jun 2021 00:43:42 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-do-we-manage-data-recency-in-databricks/m-p/22061#M15079</guid>
      <dc:creator>sajith_appukutt</dc:creator>
      <dc:date>2021-06-23T00:43:42Z</dc:date>
    </item>
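    <!-- Editor's sketch for the reply above: the stalenessLimit values (1h, 15m, 1d) are
         plain time strings of a number plus a unit suffix. The parser below is purely
         illustrative and not Databricks code; only the spark.conf.set line shown in the
         comment reflects how the setting from the post would actually be applied. -->

    ```python
    # Minimal sketch (assumption: not Databricks/Delta source code) of the
    # time-string format accepted by spark.databricks.delta.stalenessLimit.
    UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}

    def staleness_seconds(limit: str) -> int:
        """Parse a staleness limit like '1h', '15m', or '1d' into seconds."""
        value, unit = int(limit[:-1]), limit[-1]
        return value * UNIT_SECONDS[unit]

    # On a Databricks cluster, the limit from the reply would be set per session:
    # spark.conf.set("spark.databricks.delta.stalenessLimit", "15m")
    print(staleness_seconds("15m"))  # → 900
    ```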
  </channel>
</rss>

