<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Delta Live Tables: reading from output in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/delta-live-tables-reading-from-output/m-p/12491#M7291</link>
    <description>&lt;P&gt;Hi @Chris Nawara, &lt;/P&gt;&lt;P&gt;I had the same issue you had. I was trying to avoid apply_changes, but in the end I implemented it and I'm happier than I expected, hehe.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;If you have any additional standardization columns that you need to implement, you can simply read from the apply_changes table and generate the final table.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;My logic is like this:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;readStream -&amp;gt; dlt.view based on the dataframe -&amp;gt; dlt.create_streaming_live_table -&amp;gt; dlt.apply_changes (stored_as_scd_type=2) -&amp;gt; dlt.table (I had to create an additional table because I have a few columns to calculate based on the __START_AT and __END_AT columns provided by apply_changes)&lt;/P&gt;</description>
    <pubDate>Mon, 23 Jan 2023 14:29:29 GMT</pubDate>
    <dc:creator>fecavalc08</dc:creator>
    <dc:date>2023-01-23T14:29:29Z</dc:date>
    <item>
      <title>Delta Live Tables: reading from output</title>
      <link>https://community.databricks.com/t5/data-engineering/delta-live-tables-reading-from-output/m-p/12490#M7290</link>
      <description>&lt;P&gt;I'm trying to implement incremental ingestion logic in the following way:&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;The database tables have a DbUpdatedDate column&lt;/LI&gt;&lt;LI&gt;During the initial load I perform a full copy of the database table&lt;/LI&gt;&lt;LI&gt;During an incremental load I:&lt;OL&gt;&lt;LI&gt;scan the data already in the DLT to find the most recent DbUpdatedDate that we already have (let's call it a high_watermark)&lt;/LI&gt;&lt;LI&gt;query the database table to fetch only data with DbUpdatedDate &amp;gt; high_watermark&lt;/LI&gt;&lt;LI&gt;perform unionByName on the historized data and the new increment&lt;/LI&gt;&lt;/OL&gt;&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;Step 3.1 is where I'm having issues: when I try to read from the output, I keep getting errors.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;A minimal example to reproduce it:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;import dlt
INITIAL_RUN = True
@dlt.table
def test_table():
    if INITIAL_RUN:
        return spark.createDataFrame([
            {"id": 1, "val": "1"},
            {"id": 2, "val": "2"},
        ])
    else:
        return dlt.read("test_table")
        
@dlt.table
def test_table_copy():
    df = dlt.read("test_table")
    print(df.collect())
    return df&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;When INITIAL_RUN is True, everything works fine. But after I flip it to False (having run it beforehand, so the tables exist), I get the following error:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;pyspark.sql.utils.AnalysisException: Failed to read dataset 'test_table'. Dataset is defined in the pipeline but could not be resolved.&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;The same thing happens when I use &lt;I&gt;spark.table("LIVE.test_table")&lt;/I&gt;.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Is reading from the output a supported scenario? If not, how could I work around this?&lt;/P&gt;</description>
      <pubDate>Wed, 11 Jan 2023 19:12:32 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/delta-live-tables-reading-from-output/m-p/12490#M7290</guid>
      <dc:creator>knawara</dc:creator>
      <dc:date>2023-01-11T19:12:32Z</dc:date>
    </item>
    <item>
      <title>Re: Delta Live Tables: reading from output</title>
      <link>https://community.databricks.com/t5/data-engineering/delta-live-tables-reading-from-output/m-p/12491#M7291</link>
      <description>&lt;P&gt;Hi @Chris Nawara, &lt;/P&gt;&lt;P&gt;I had the same issue you had. I was trying to avoid apply_changes, but in the end I implemented it and I'm happier than I expected, hehe.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;If you have any additional standardization columns that you need to implement, you can simply read from the apply_changes table and generate the final table.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;My logic is like this:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;readStream -&amp;gt; dlt.view based on the dataframe -&amp;gt; dlt.create_streaming_live_table -&amp;gt; dlt.apply_changes (stored_as_scd_type=2) -&amp;gt; dlt.table (I had to create an additional table because I have a few columns to calculate based on the __START_AT and __END_AT columns provided by apply_changes)&lt;/P&gt;</description>
      <pubDate>Mon, 23 Jan 2023 14:29:29 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/delta-live-tables-reading-from-output/m-p/12491#M7291</guid>
      <dc:creator>fecavalc08</dc:creator>
      <dc:date>2023-01-23T14:29:29Z</dc:date>
    </item>
    <item>
      <title>Re: Delta Live Tables: reading from output</title>
      <link>https://community.databricks.com/t5/data-engineering/delta-live-tables-reading-from-output/m-p/12492#M7292</link>
      <description>&lt;P&gt;Hi!&lt;/P&gt;&lt;P&gt;@Felipe Cavalcante&amp;nbsp;are you querying the database directly? Or do you have a CDC stream in, e.g., Kafka? To put it differently: where is the first readStream reading data from?&lt;/P&gt;&lt;P&gt;Best regards,&lt;/P&gt;&lt;P&gt;Chris&lt;/P&gt;</description>
      <pubDate>Mon, 23 Jan 2023 15:12:00 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/delta-live-tables-reading-from-output/m-p/12492#M7292</guid>
      <dc:creator>knawara</dc:creator>
      <dc:date>2023-01-23T15:12:00Z</dc:date>
    </item>
    <item>
      <title>Re: Delta Live Tables: reading from output</title>
      <link>https://community.databricks.com/t5/data-engineering/delta-live-tables-reading-from-output/m-p/12493#M7293</link>
      <description>&lt;P&gt;Hi @Chris Nawara, we read from ADLS.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;BR&lt;/P&gt;</description>
      <pubDate>Tue, 31 Jan 2023 12:51:37 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/delta-live-tables-reading-from-output/m-p/12493#M7293</guid>
      <dc:creator>fecavalc08</dc:creator>
      <dc:date>2023-01-31T12:51:37Z</dc:date>
    </item>
    <item>
      <title>Re: Delta Live Tables: reading from output</title>
      <link>https://community.databricks.com/t5/data-engineering/delta-live-tables-reading-from-output/m-p/12494#M7294</link>
      <description>&lt;P&gt;Hi @Felipe Cavalcante! In my use case I want to read from a database table, so I guess if you're reading from an ADLS location, that's a different case.&lt;/P&gt;</description>
      <pubDate>Wed, 15 Feb 2023 11:58:13 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/delta-live-tables-reading-from-output/m-p/12494#M7294</guid>
      <dc:creator>knawara</dc:creator>
      <dc:date>2023-02-15T11:58:13Z</dc:date>
    </item>
  </channel>
</rss>

