<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Backfill Delta table in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/backfill-delta-table/m-p/26218#M18327</link>
    <description>Another approach you might consider is creating a template notebook that queries a known date range using widgets, for example two date widgets for the start time and end time. You could then use Databricks Jobs to set these parameters for each run; each date range spins up its own cluster, and those clusters can run in parallel.</description>
    <pubDate>Mon, 07 Jun 2021 14:10:00 GMT</pubDate>
    <dc:creator>User16783855117</dc:creator>
    <dc:date>2021-06-07T14:10:00Z</dc:date>
    <item>
      <title>Backfill Delta table</title>
      <link>https://community.databricks.com/t5/data-engineering/backfill-delta-table/m-p/26217#M18326</link>
      <description>&lt;P&gt;What is the recommended way to backfill a Delta table using a series of smaller date-partitioned jobs?&lt;/P&gt;</description>
      <pubDate>Fri, 04 Jun 2021 19:38:34 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/backfill-delta-table/m-p/26217#M18326</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2021-06-04T19:38:34Z</dc:date>
    </item>
    <item>
      <title>Re: Backfill Delta table</title>
      <link>https://community.databricks.com/t5/data-engineering/backfill-delta-table/m-p/26218#M18327</link>
      <description>Another approach you might consider is creating a template notebook that queries a known date range using widgets, for example two date widgets for the start time and end time. You could then use Databricks Jobs to set these parameters for each run; each date range spins up its own cluster, and those clusters can run in parallel.</description>
      <pubDate>Mon, 07 Jun 2021 14:10:00 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/backfill-delta-table/m-p/26218#M18327</guid>
      <dc:creator>User16783855117</dc:creator>
      <dc:date>2021-06-07T14:10:00Z</dc:date>
    </item>
  </channel>
</rss>