<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Identifying Full Refresh vs. Incremental Runs in Delta Live Tables in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/identifying-full-refresh-vs-incremental-runs-in-delta-live/m-p/106754#M42575</link>
    <description>&lt;P&gt;Hello Community,&lt;/P&gt;&lt;P&gt;I am working with a &lt;STRONG&gt;Delta Live Tables (DLT) pipeline&lt;/STRONG&gt; that primarily operates in &lt;STRONG&gt;incremental mode&lt;/STRONG&gt;. However, there are specific scenarios where I need to perform a &lt;STRONG&gt;full refresh&lt;/STRONG&gt; of the pipeline. I am looking for an efficient and reliable way to determine, within the pipeline's Python codebase, whether it was triggered as a &lt;STRONG&gt;full refresh&lt;/STRONG&gt; or a &lt;STRONG&gt;normal incremental run&lt;/STRONG&gt;.&lt;/P&gt;&lt;H3&gt;My Requirements:&lt;/H3&gt;&lt;OL&gt;&lt;LI&gt;&lt;STRONG&gt;Dynamic Identification&lt;/STRONG&gt;: The solution should enable the code to dynamically identify the type of run (full refresh vs. incremental).&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Pipeline Configuration&lt;/STRONG&gt;: Ideally, this should be achieved by configuring something within the DLT pipeline, such as a parameter or flag.&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Accessing the Configuration&lt;/STRONG&gt;: The configuration should be accessible within the Python code during execution, allowing me to assign the information to variables for downstream logic.&lt;/LI&gt;&lt;/OL&gt;&lt;H3&gt;My Questions:&lt;/H3&gt;&lt;UL&gt;&lt;LI&gt;Is there an existing way in Databricks DLT to configure and identify the type of run?&lt;/LI&gt;&lt;LI&gt;Can the run type (full refresh vs. incremental) be passed as a parameter or stored in a metadata table that the pipeline can read?&lt;/LI&gt;&lt;LI&gt;Are there any best practices for handling such scenarios efficiently in DLT?&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;Any guidance, examples, or insights from your experience would be greatly appreciated.&lt;/P&gt;&lt;P&gt;Thank you in advance for your support!&lt;/P&gt;</description>
    <pubDate>Thu, 23 Jan 2025 06:34:57 GMT</pubDate>
    <dc:creator>yvishal519</dc:creator>
    <dc:date>2025-01-23T06:34:57Z</dc:date>
    <item>
      <title>Identifying Full Refresh vs. Incremental Runs in Delta Live Tables</title>
      <link>https://community.databricks.com/t5/data-engineering/identifying-full-refresh-vs-incremental-runs-in-delta-live/m-p/106754#M42575</link>
      <description>&lt;P&gt;Hello Community,&lt;/P&gt;&lt;P&gt;I am working with a &lt;STRONG&gt;Delta Live Tables (DLT) pipeline&lt;/STRONG&gt; that primarily operates in &lt;STRONG&gt;incremental mode&lt;/STRONG&gt;. However, there are specific scenarios where I need to perform a &lt;STRONG&gt;full refresh&lt;/STRONG&gt; of the pipeline. I am looking for an efficient and reliable way to determine, within the pipeline's Python codebase, whether it was triggered as a &lt;STRONG&gt;full refresh&lt;/STRONG&gt; or a &lt;STRONG&gt;normal incremental run&lt;/STRONG&gt;.&lt;/P&gt;&lt;H3&gt;My Requirements:&lt;/H3&gt;&lt;OL&gt;&lt;LI&gt;&lt;STRONG&gt;Dynamic Identification&lt;/STRONG&gt;: The solution should enable the code to dynamically identify the type of run (full refresh vs. incremental).&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Pipeline Configuration&lt;/STRONG&gt;: Ideally, this should be achieved by configuring something within the DLT pipeline, such as a parameter or flag.&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Accessing the Configuration&lt;/STRONG&gt;: The configuration should be accessible within the Python code during execution, allowing me to assign the information to variables for downstream logic.&lt;/LI&gt;&lt;/OL&gt;&lt;H3&gt;My Questions:&lt;/H3&gt;&lt;UL&gt;&lt;LI&gt;Is there an existing way in Databricks DLT to configure and identify the type of run?&lt;/LI&gt;&lt;LI&gt;Can the run type (full refresh vs. incremental) be passed as a parameter or stored in a metadata table that the pipeline can read?&lt;/LI&gt;&lt;LI&gt;Are there any best practices for handling such scenarios efficiently in DLT?&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;Any guidance, examples, or insights from your experience would be greatly appreciated.&lt;/P&gt;&lt;P&gt;Thank you in advance for your support!&lt;/P&gt;</description>
      <pubDate>Thu, 23 Jan 2025 06:34:57 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/identifying-full-refresh-vs-incremental-runs-in-delta-live/m-p/106754#M42575</guid>
      <dc:creator>yvishal519</dc:creator>
      <dc:date>2025-01-23T06:34:57Z</dc:date>
    </item>
    <item>
      <title>Re: Identifying Full Refresh vs. Incremental Runs in Delta Live Tables</title>
      <link>https://community.databricks.com/t5/data-engineering/identifying-full-refresh-vs-incremental-runs-in-delta-live/m-p/106998#M42671</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;There are two ways to determine whether a DLT pipeline update was started as a full refresh or as an incremental run:&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;&lt;P&gt;&lt;STRONG&gt;DLT Event Log Schema&lt;/STRONG&gt;&lt;BR /&gt;The details column of the DLT event log includes a "full_refresh" field, which is true for a full refresh and false otherwise.&lt;/P&gt;&lt;P&gt;&lt;A href="https://docs.databricks.com/ja/delta-live-tables/observability.html#event-log-schema" target="_blank" rel="noopener"&gt;DLT Event Log Schema Documentation&lt;/A&gt;&lt;/P&gt;&lt;P&gt;An example of the details column is as follows:&lt;/P&gt;&lt;PRE&gt;{"user_action":{"action":"START","user_name":"xxxxxxx@gmail.com","user_id":xxxxxxxx,"request":{"start_request":{"full_refresh":false,"validate_only":false}}}}&lt;/PRE&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;&lt;STRONG&gt;Databricks REST API&lt;/STRONG&gt;&lt;BR /&gt;You can retrieve information about a pipeline update using the Databricks REST API. The response also contains the "full_refresh" field, which you can check in the same way.&lt;/P&gt;&lt;P&gt;Since you can invoke the Databricks REST API from Python, this might help you achieve what you’re aiming for.&lt;/P&gt;&lt;P&gt;&lt;A href="https://docs.databricks.com/api/workspace/pipelines/getupdate" target="_blank" rel="noopener"&gt;Databricks REST API Documentation - Get Pipeline Update&lt;/A&gt;&lt;/P&gt;&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;I hope this helps!&lt;/P&gt;</description>
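      <content:encoded><![CDATA[
As a minimal sketch of the event log approach in the reply above, the snippet below reads the "full_refresh" flag out of a details value shaped like the quoted example. The function name `full_refresh_from_details` is hypothetical, the sample's `user_name` and `user_id` are placeholders for the redacted values, and it assumes you have already obtained the details column as a JSON string; how you query the event log is not shown here.

```python
import json
from typing import Optional


def full_refresh_from_details(details_json: str) -> Optional[bool]:
    """Return the full_refresh flag from an event log `details` JSON string.

    Returns None when the event does not carry
    user_action -> request -> start_request, so unrelated events can be skipped.
    (Hypothetical helper, not part of any Databricks library.)
    """
    node = json.loads(details_json)
    # Walk down the nested keys seen in the quoted example, bailing out
    # as soon as the expected structure is missing.
    for key in ("user_action", "request", "start_request"):
        if not isinstance(node, dict):
            return None
        node = node.get(key)
    if not isinstance(node, dict):
        return None
    flag = node.get("full_refresh")
    return flag if isinstance(flag, bool) else None


# Sample modeled on the details value quoted in the reply; user_name and
# user_id are placeholders standing in for the redacted values.
sample = json.dumps({
    "user_action": {
        "action": "START",
        "user_name": "user@example.com",
        "user_id": 0,
        "request": {
            "start_request": {"full_refresh": False, "validate_only": False}
        },
    }
})

is_full_refresh = full_refresh_from_details(sample)
run_type = "full refresh" if is_full_refresh else "incremental"
print(run_type)
```

Returning None for events without a start request keeps the helper safe to apply to every row of the event log, since most events are not update start actions.
      ]]></content:encoded>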
      <pubDate>Sat, 25 Jan 2025 10:22:13 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/identifying-full-refresh-vs-incremental-runs-in-delta-live/m-p/106998#M42671</guid>
      <dc:creator>Takuya-Omi</dc:creator>
      <dc:date>2025-01-25T10:22:13Z</dc:date>
    </item>
    <item>
      <title>Re: Identifying Full Refresh vs. Incremental Runs in Delta Live Tables</title>
      <link>https://community.databricks.com/t5/data-engineering/identifying-full-refresh-vs-incremental-runs-in-delta-live/m-p/131539#M49126</link>
      <description>&lt;P&gt;How do I get that into the notebook? When I click Full refresh, I want a particular column in the pipeline table to capture it, with a value such as "Full Refresh on &amp;lt;timestamp&amp;gt;".&lt;/P&gt;</description>
      <pubDate>Wed, 10 Sep 2025 14:13:05 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/identifying-full-refresh-vs-incremental-runs-in-delta-live/m-p/131539#M49126</guid>
      <dc:creator>km1837</dc:creator>
      <dc:date>2025-09-10T14:13:05Z</dc:date>
    </item>
  </channel>
</rss>

