<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: First Lakeflow (DLT) Pipeline Best Practice Question in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/first-lakeflow-dlt-pipeline-best-practice-question/m-p/130089#M48693</link>
    <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/128195"&gt;@mtreigelman&lt;/a&gt; thanks for providing the update.&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;If you wouldn't mind, could you explain why you think the first way didn't work and why the second way did? Then you can mark your response as a solution to the question &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;.&lt;BR /&gt;&lt;BR /&gt;I found this article useful for joins with streaming tables:&amp;nbsp;&lt;A href="https://docs.databricks.com/aws/en/transform/join" target="_blank"&gt;https://docs.databricks.com/aws/en/transform/join&lt;/A&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="BS_THE_ANALYST_0-1756409434209.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/19459i4A10CDEAAC7825A1/image-size/medium?v=v2&amp;amp;px=400" role="button" title="BS_THE_ANALYST_0-1756409434209.png" alt="BS_THE_ANALYST_0-1756409434209.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;There's some nice info to branch out to from there, e.g. stream-static joins.&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;All the best,&lt;BR /&gt;BS&lt;/P&gt;</description>
    <pubDate>Thu, 28 Aug 2025 19:33:26 GMT</pubDate>
    <dc:creator>BS_THE_ANALYST</dc:creator>
    <dc:date>2025-08-28T19:33:26Z</dc:date>
    <item>
      <title>First Lakeflow (DLT) Pipeline Best Practice Question</title>
      <link>https://community.databricks.com/t5/data-engineering/first-lakeflow-dlt-pipeline-best-practice-question/m-p/129925#M48643</link>
      <description>&lt;P&gt;Hi, I am writing my first streaming pipeline and trying to ensure it is set up to work as a "Lakeflow" pipeline.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;It joins data from an external Oracle database with some external Azure Blob storage data (all managed in the same Unity Catalog). The pipeline is just a simple join that creates a gold-level table for analysts to use.&amp;nbsp;&lt;/P&gt;&lt;P&gt;My first attempt at this looks like:&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;%sql
CREATE OR REPLACE STREAMING TABLE final_streaming_table_name
USING DELTA
AS
SELECT *final_cols*
FROM oracle_table_1 o1
  JOIN (SELECT *some_cols* FROM azure_table) a ON a.join_col = o1.join_col_1
  JOIN (SELECT *some_cols* FROM oracle_table_2) o2 ON o2.join_col = o1.join_col_2&lt;/LI-CODE&gt;&lt;P&gt;But I also saw some other resources saying that I need to wrap each source table in STREAM(), like this:&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;%sql
CREATE OR REPLACE STREAMING TABLE final_streaming_table_name
USING DELTA
AS
SELECT *final_cols*
FROM STREAM(azure_table) a
JOIN STREAM(oracle_table_1) o1 ON o1.join_col_1 = a.join_col
JOIN STREAM(oracle_table_2) o2 ON o2.join_col = o1.join_col_2&lt;/LI-CODE&gt;&lt;P&gt;The external Oracle databases have no regular update schedule, and the underlying values can change at any time. I would like my `final_streaming_table_name` to always have the freshest values.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRIKE&gt;Is one way right? Or would they both work, and if both work, what are the benefits or downsides of each strategy?&lt;/STRIKE&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;After experimentation, it appears you cannot do it the first way&amp;nbsp;&lt;span class="lia-unicode-emoji" title=":grinning_face_with_sweat:"&gt;😅&lt;/span&gt;&amp;nbsp;Consider this CLOSED.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 27 Aug 2025 16:26:39 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/first-lakeflow-dlt-pipeline-best-practice-question/m-p/129925#M48643</guid>
      <dc:creator>mtreigelman</dc:creator>
      <dc:date>2025-08-27T16:26:39Z</dc:date>
    </item>
    <item>
      <title>Re: First Lakeflow (DLT) Pipeline Best Practice Question</title>
      <link>https://community.databricks.com/t5/data-engineering/first-lakeflow-dlt-pipeline-best-practice-question/m-p/130089#M48693</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/128195"&gt;@mtreigelman&lt;/a&gt; thanks for providing the update.&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;If you wouldn't mind, could you explain why you think the first way didn't work and why the second way did? Then you can mark your response as a solution to the question &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;.&lt;BR /&gt;&lt;BR /&gt;I found this article useful for joins with streaming tables:&amp;nbsp;&lt;A href="https://docs.databricks.com/aws/en/transform/join" target="_blank"&gt;https://docs.databricks.com/aws/en/transform/join&lt;/A&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="BS_THE_ANALYST_0-1756409434209.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/19459i4A10CDEAAC7825A1/image-size/medium?v=v2&amp;amp;px=400" role="button" title="BS_THE_ANALYST_0-1756409434209.png" alt="BS_THE_ANALYST_0-1756409434209.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;There's some nice info to branch out to from there, e.g. stream-static joins.&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;All the best,&lt;BR /&gt;BS&lt;/P&gt;</description>
      <pubDate>Thu, 28 Aug 2025 19:33:26 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/first-lakeflow-dlt-pipeline-best-practice-question/m-p/130089#M48693</guid>
      <dc:creator>BS_THE_ANALYST</dc:creator>
      <dc:date>2025-08-28T19:33:26Z</dc:date>
    </item>
  </channel>
</rss>

