<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Best practice for creating SQL views on top of continuously running Spark Structured Streaming jobs in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/best-practice-for-creating-sql-views-on-top-of-continuously/m-p/156187#M54382</link>
    <description>&lt;P&gt;I am working with a continuously running Spark Structured Streaming job in Databricks, deployed as a standalone job using continuous trigger mode via Databricks Asset Bundles (DABs).&lt;/P&gt;&lt;P&gt;On top of the streaming output table (created via writeStream), I want to define a SQL view. However, I am unsure about the best practice for handling this in a CI/CD-friendly way.&lt;/P&gt;&lt;P&gt;The core challenge is that the streaming job is designed to run continuously and therefore never reaches a terminal “success” state. Because of this, it cannot easily be orchestrated within a multi-task job where a downstream notebook task depends on its successful completion to create the view.&lt;/P&gt;&lt;P&gt;I have considered a few possible approaches:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;STRONG&gt;Pre-defining the table and view in a separate notebook task&lt;/STRONG&gt; that the streaming job depends on. This works, but it requires manual schema management, whereas ideally I would like Spark to infer and manage the schema automatically when creating the table via writeStream.&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Creating a separate job/notebook that waits for the table to exist and then creates the view&lt;/STRONG&gt;, potentially using retry logic or a polling loop. However, since Databricks jobs do not support a true “run once after deployment” pattern in a clean way, this approach feels fragile.&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Triggering a post-deployment step via the Databricks CLI&lt;/STRONG&gt; to run a job that creates the view after deployment. While viable, this would require changes to the existing CI/CD pipeline, which I would prefer to avoid.&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;What is the recommended or most elegant way to handle this pattern in Databricks when working with continuously running streaming jobs and downstream SQL views in a CI/CD setup using DABs?&lt;/P&gt;</description>
    <pubDate>Tue, 05 May 2026 17:55:41 GMT</pubDate>
    <dc:creator>mnissen1337</dc:creator>
    <dc:date>2026-05-05T17:55:41Z</dc:date>
    <item>
      <title>Best practice for creating SQL views on top of continuously running Spark Structured Streaming jobs</title>
      <link>https://community.databricks.com/t5/data-engineering/best-practice-for-creating-sql-views-on-top-of-continuously/m-p/156187#M54382</link>
      <description>&lt;P&gt;I am working with a continuously running Spark Structured Streaming job in Databricks, deployed as a standalone job using continuous trigger mode via Databricks Asset Bundles (DABs).&lt;/P&gt;&lt;P&gt;On top of the streaming output table (created via writeStream), I want to define a SQL view. However, I am unsure about the best practice for handling this in a CI/CD-friendly way.&lt;/P&gt;&lt;P&gt;The core challenge is that the streaming job is designed to run continuously and therefore never reaches a terminal “success” state. Because of this, it cannot easily be orchestrated within a multi-task job where a downstream notebook task depends on its successful completion to create the view.&lt;/P&gt;&lt;P&gt;I have considered a few possible approaches:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;STRONG&gt;Pre-defining the table and view in a separate notebook task&lt;/STRONG&gt; that the streaming job depends on. This works, but it requires manual schema management, whereas ideally I would like Spark to infer and manage the schema automatically when creating the table via writeStream.&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Creating a separate job/notebook that waits for the table to exist and then creates the view&lt;/STRONG&gt;, potentially using retry logic or a polling loop. However, since Databricks jobs do not support a true “run once after deployment” pattern in a clean way, this approach feels fragile.&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Triggering a post-deployment step via the Databricks CLI&lt;/STRONG&gt; to run a job that creates the view after deployment. While viable, this would require changes to the existing CI/CD pipeline, which I would prefer to avoid.&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;What is the recommended or most elegant way to handle this pattern in Databricks when working with continuously running streaming jobs and downstream SQL views in a CI/CD setup using DABs?&lt;/P&gt;</description>
      <pubDate>Tue, 05 May 2026 17:55:41 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/best-practice-for-creating-sql-views-on-top-of-continuously/m-p/156187#M54382</guid>
      <dc:creator>mnissen1337</dc:creator>
      <dc:date>2026-05-05T17:55:41Z</dc:date>
    </item>
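    <!--
    A minimal sketch of the second approach the post describes (a job that polls until the
    streaming output table exists, then creates the view). All table/view names, timeouts,
    and the polling interval below are hypothetical placeholders, not values from the post;
    spark.catalog.tableExists and spark.sql are standard PySpark APIs.

    ```python
    import time

    def create_view_when_ready(spark,
                               table="main.analytics.stream_output",      # hypothetical name
                               view="main.analytics.stream_output_view",  # hypothetical name
                               timeout_s=600, poll_s=15):
        """Poll until `table` exists, then define `view` on top of it.

        Intended to run as a one-off task after the DAB deployment; raises
        TimeoutError if the streaming job never materializes the table.
        """
        deadline = time.time() + timeout_s
        while time.time() < deadline:
            if spark.catalog.tableExists(table):
                # Table schema was inferred by writeStream; the view simply mirrors it.
                spark.sql(f"CREATE OR REPLACE VIEW {view} AS SELECT * FROM {table}")
                return True
            time.sleep(poll_s)
        raise TimeoutError(f"Table {table} did not appear within {timeout_s}s")
    ```

    Using CREATE OR REPLACE makes the step idempotent, so re-running the job after each
    deployment is safe even when the view already exists.
    -->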
  </channel>
</rss>

