<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: DLT Pipeline issue - Failed to read dataset .Dataset is not defined in the pipeline. in Get Started Discussions</title>
    <link>https://community.databricks.com/t5/get-started-discussions/dlt-pipeline-issue-failed-to-read-dataset-dataset-is-not-defined/m-p/55805#M2066</link>
    <description>&lt;P&gt;My notebook looks like the code below, and there are 5 such blocks, one per table. I previously had this in a separate notebook, which ran very well. But now that I have merged it with the bronze layer notebook, I run into issues.&lt;/P&gt;&lt;P&gt;@Dlt.table(&lt;BR /&gt;name="Temp Table",&lt;BR /&gt;table_properties={"quality" : "silver"},&lt;BR /&gt;Temporary=True&lt;BR /&gt;)&lt;BR /&gt;@Dlt.expect_all(rules)&lt;BR /&gt;def Temp_Table():&lt;BR /&gt;return (&lt;BR /&gt;spark.sql("SELECT * FROM bronze_layer_table")&lt;BR /&gt;.withColumn("is_bad_data", expr(quarantine_rules)))&lt;BR /&gt;&lt;BR /&gt;@Dlt.table(&lt;BR /&gt;name="Clean Table",&lt;BR /&gt;table_properties={"quality" : "silver"}&lt;BR /&gt;)&lt;BR /&gt;def get_clean_data():&lt;BR /&gt;return (&lt;BR /&gt;dlt.read("Temp Table")&lt;BR /&gt;.filter("is_bad_data=false")&lt;BR /&gt;)&lt;BR /&gt;&lt;BR /&gt;@Dlt.table(&lt;BR /&gt;name="Bad Data",&lt;BR /&gt;table_properties={"quality" : "silver"}&lt;BR /&gt;)&lt;BR /&gt;def get_bad_data():&lt;BR /&gt;return (&lt;BR /&gt;dlt.read("Temp Table")&lt;BR /&gt;.filter("is_bad_data=true")&lt;BR /&gt;)&lt;/P&gt;&lt;P&gt;Error:&lt;/P&gt;&lt;P&gt;Failed to resolve flow due to upstream failure.&lt;/P&gt;&lt;P&gt;Failed to read dataset 'Temp Table'. Dataset is not defined in the pipeline.&lt;/P&gt;</description>
    <pubDate>Wed, 27 Dec 2023 13:54:28 GMT</pubDate>
    <dc:creator>Dlt</dc:creator>
    <dc:date>2023-12-27T13:54:28Z</dc:date>
    <item>
      <title>DLT Pipeline issue - Failed to read dataset .Dataset is not defined in the pipeline.</title>
      <link>https://community.databricks.com/t5/get-started-discussions/dlt-pipeline-issue-failed-to-read-dataset-dataset-is-not-defined/m-p/55779#M2061</link>
      <description>&lt;P&gt;Background.&lt;/P&gt;&lt;P&gt;I have created a DLT pipeline in which I am creating a temporary table. There are 5 such temporary tables. When I executed these in an independent notebook, they all worked fine with DLT.&lt;/P&gt;&lt;P&gt;Now I have merged this notebook (keeping the exact same code) with the other bronze layer notebook.&lt;/P&gt;&lt;P&gt;Each temporary table is in a separate cell. But with this consolidated notebook I am getting the above-mentioned error.&lt;/P&gt;&lt;P&gt;So I am not sure what the issue is here: if DLT code worked independently earlier, why would it fail when combined with other bronze layer code?&lt;/P&gt;&lt;P&gt;This is happening due to a feature issue with DLT where we can add multiple notebooks but cannot set up a sequence.&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Wed, 27 Dec 2023 09:42:06 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/dlt-pipeline-issue-failed-to-read-dataset-dataset-is-not-defined/m-p/55779#M2061</guid>
      <dc:creator>Dlt</dc:creator>
      <dc:date>2023-12-27T09:42:06Z</dc:date>
    </item>
    <item>
      <title>Re: DLT Pipeline issue - Failed to read dataset .Dataset is not defined in the pipeline.</title>
      <link>https://community.databricks.com/t5/get-started-discussions/dlt-pipeline-issue-failed-to-read-dataset-dataset-is-not-defined/m-p/55795#M2062</link>
      <description>&lt;P&gt;Please use the Workflows and Jobs option and associate the respective notebooks with the job in order to enable sequential processing.&lt;/P&gt;&lt;P&gt;&lt;A href="https://docs.databricks.com/en/workflows/jobs/create-run-jobs.html" target="_blank"&gt;https://docs.databricks.com/en/workflows/jobs/create-run-jobs.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;This functionality is also available for DLT tables, though I have not used it with DLT tables.&lt;/P&gt;</description>
      <pubDate>Wed, 27 Dec 2023 13:01:15 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/dlt-pipeline-issue-failed-to-read-dataset-dataset-is-not-defined/m-p/55795#M2062</guid>
      <dc:creator>BR_DatabricksAI</dc:creator>
      <dc:date>2023-12-27T13:01:15Z</dc:date>
    </item>
    <item>
      <title>Re: DLT Pipeline issue - Failed to read dataset .Dataset is not defined in the pipeline.</title>
      <link>https://community.databricks.com/t5/get-started-discussions/dlt-pipeline-issue-failed-to-read-dataset-dataset-is-not-defined/m-p/55797#M2063</link>
      <description>&lt;P&gt;You probably messed up your code, or alternatively, the runtime has been upgraded, and it stopped working.&lt;/P&gt;&lt;P&gt;DLT is 100% declarative and never runs in any kind of sequence; instead, it is figuring out dependencies between tables and setting the execution DAG (putting code in separate notebooks is just a way of keeping your code clean).&lt;/P&gt;&lt;P&gt;Maybe you can attach your notebook and a screenshot of the error in the DLT pipeline.&lt;/P&gt;&lt;P&gt;There is also another thing you can try:&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;Create a new DLT pipeline.&lt;/LI&gt;&lt;LI&gt;Target a new schema.&lt;/LI&gt;&lt;LI&gt;Put your final bronze notebook.&lt;/LI&gt;&lt;LI&gt;Run.&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;If it runs okay, there is a chance that DLT bugged out.&lt;/P&gt;</description>
      <pubDate>Wed, 27 Dec 2023 13:09:46 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/dlt-pipeline-issue-failed-to-read-dataset-dataset-is-not-defined/m-p/55797#M2063</guid>
      <dc:creator>Wojciech_BUK</dc:creator>
      <dc:date>2023-12-27T13:09:46Z</dc:date>
    </item>
    <item>
      <title>Re: DLT Pipeline issue - Failed to read dataset .Dataset is not defined in the pipeline.</title>
      <link>https://community.databricks.com/t5/get-started-discussions/dlt-pipeline-issue-failed-to-read-dataset-dataset-is-not-defined/m-p/55805#M2066</link>
      <description>&lt;P&gt;My notebook looks like the code below, and there are 5 such blocks, one per table. I previously had this in a separate notebook, which ran very well. But now that I have merged it with the bronze layer notebook, I run into issues.&lt;/P&gt;&lt;P&gt;@Dlt.table(&lt;BR /&gt;name="Temp Table",&lt;BR /&gt;table_properties={"quality" : "silver"},&lt;BR /&gt;Temporary=True&lt;BR /&gt;)&lt;BR /&gt;@Dlt.expect_all(rules)&lt;BR /&gt;def Temp_Table():&lt;BR /&gt;return (&lt;BR /&gt;spark.sql("SELECT * FROM bronze_layer_table")&lt;BR /&gt;.withColumn("is_bad_data", expr(quarantine_rules)))&lt;BR /&gt;&lt;BR /&gt;@Dlt.table(&lt;BR /&gt;name="Clean Table",&lt;BR /&gt;table_properties={"quality" : "silver"}&lt;BR /&gt;)&lt;BR /&gt;def get_clean_data():&lt;BR /&gt;return (&lt;BR /&gt;dlt.read("Temp Table")&lt;BR /&gt;.filter("is_bad_data=false")&lt;BR /&gt;)&lt;BR /&gt;&lt;BR /&gt;@Dlt.table(&lt;BR /&gt;name="Bad Data",&lt;BR /&gt;table_properties={"quality" : "silver"}&lt;BR /&gt;)&lt;BR /&gt;def get_bad_data():&lt;BR /&gt;return (&lt;BR /&gt;dlt.read("Temp Table")&lt;BR /&gt;.filter("is_bad_data=true")&lt;BR /&gt;)&lt;/P&gt;&lt;P&gt;Error:&lt;/P&gt;&lt;P&gt;Failed to resolve flow due to upstream failure.&lt;/P&gt;&lt;P&gt;Failed to read dataset 'Temp Table'. Dataset is not defined in the pipeline.&lt;/P&gt;</description>
      <pubDate>Wed, 27 Dec 2023 13:54:28 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/dlt-pipeline-issue-failed-to-read-dataset-dataset-is-not-defined/m-p/55805#M2066</guid>
      <dc:creator>Dlt</dc:creator>
      <dc:date>2023-12-27T13:54:28Z</dc:date>
    </item>
    <item>
      <title>Re: DLT Pipeline issue - Failed to read dataset .Dataset is not defined in the pipeline.</title>
      <link>https://community.databricks.com/t5/get-started-discussions/dlt-pipeline-issue-failed-to-read-dataset-dataset-is-not-defined/m-p/55815#M2070</link>
      <description>&lt;P&gt;I don't believe your code was working before, at least not the version you pasted above with spaces in the table names, because DLT threw errors for me saying it could not register them.&lt;BR /&gt;I made some corrections to the code and it works OK.&lt;/P&gt;&lt;P&gt;I added underscores ("_") to the table names, changed the decorators from "@Dlt" to "@dlt", and changed "Temporary" to "temporary".&lt;/P&gt;&lt;P&gt;I had to drop your expectations and fake one column with a static value.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Wojciech_BUK_0-1703689873573.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/5675i669662932D3D6C26/image-size/medium/is-moderation-mode/true?v=v2&amp;amp;px=400" role="button" title="Wojciech_BUK_0-1703689873573.png" alt="Wojciech_BUK_0-1703689873573.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;import dlt

from pyspark.sql.functions import lit

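# Temporary staging table: available inside the pipeline, not published outside it.
@dlt.table(
    name="Temp_Table",
    table_properties={"quality": "silver"},
    temporary=True,
)
def Temp_Table():
    return (
        spark.sql("SELECT * FROM priv_wojciech_bukowski.dss_gold.dim_dss_date")
        # Static stand-in for the original quarantine expression.
        .withColumn("is_bad_data", lit('xxx'))
    )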
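
# Sketch only, not part of the code that was run for this reply. The source above
# lives outside the pipeline. If it were a dataset defined in this same pipeline
# (such as the bronze table from the question), the query would presumably need
# the LIVE qualifier so that DLT can resolve the dependency; dlt.read() on the
# bare name is the other route. live_select is a hypothetical helper name.
def live_select(dataset_name):
    # Build a SELECT that reads a same-pipeline dataset through the LIVE schema.
    return f"SELECT * FROM LIVE.{dataset_name}"

# Possible use inside a table function (dataset name assumed from the question):
# spark.sql(live_select("bronze_layer_table"))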


# Downstream table that reads the temporary table through dlt.read.
@dlt.table(
    name="Clean_Table",
    table_properties={"quality": "silver"},
)
def get_clean_data():
    return (
        dlt.read("Temp_Table")
        .filter("is_bad_data='xxx'")
    )&lt;/LI-CODE&gt;</description>
      <pubDate>Wed, 27 Dec 2023 15:12:03 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/dlt-pipeline-issue-failed-to-read-dataset-dataset-is-not-defined/m-p/55815#M2070</guid>
      <dc:creator>Wojciech_BUK</dc:creator>
      <dc:date>2023-12-27T15:12:03Z</dc:date>
    </item>
    <item>
      <title>Re: DLT Pipeline issue - Failed to read dataset .Dataset is not defined in the pipeline.</title>
      <link>https://community.databricks.com/t5/get-started-discussions/dlt-pipeline-issue-failed-to-read-dataset-dataset-is-not-defined/m-p/55828#M2071</link>
      <description>&lt;P&gt;Please note that the code worked earlier when I was running it from a separate notebook; these errors are just typos.&lt;/P&gt;&lt;P&gt;Considering the code has no syntax issues, what could have gone wrong with the same code when it is placed below the bronze layer code so that there is just one notebook instead of two?&lt;/P&gt;</description>
      <pubDate>Wed, 27 Dec 2023 17:42:24 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/dlt-pipeline-issue-failed-to-read-dataset-dataset-is-not-defined/m-p/55828#M2071</guid>
      <dc:creator>Dlt</dc:creator>
      <dc:date>2023-12-27T17:42:24Z</dc:date>
    </item>
    <item>
      <title>Re: DLT Pipeline issue - Failed to read dataset .Dataset is not defined in the pipeline.</title>
      <link>https://community.databricks.com/t5/get-started-discussions/dlt-pipeline-issue-failed-to-read-dataset-dataset-is-not-defined/m-p/55832#M2073</link>
      <description>&lt;P&gt;Maybe you materialized the table and later substituted it with a temporary table (just a guess).&lt;/P&gt;&lt;P&gt;There were some recent changes so that DLT now has only tables and materialized views, while the legacy syntax is still allowed. So there is a chance that, for example, something ran on a certain version of the DLT pipeline and is not working on a new version (and you don't have control over the version).&lt;/P&gt;&lt;P&gt;Again, that is just a guess, as I did not see your code and pipelines before and after the changes, or the artifacts created by DLT.&lt;/P&gt;&lt;P&gt;When you merge code and pipelines, there is always a chance something goes wrong, especially in DLT, as you don't have control over objects like in the classic approach &lt;span class="lia-unicode-emoji" title=":confused_face:"&gt;😕&lt;/span&gt;&lt;/P&gt;&lt;P&gt;I could not replicate your issue with the code that you provided.&lt;/P&gt;</description>
      <pubDate>Wed, 27 Dec 2023 17:57:01 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/dlt-pipeline-issue-failed-to-read-dataset-dataset-is-not-defined/m-p/55832#M2073</guid>
      <dc:creator>Wojciech_BUK</dc:creator>
      <dc:date>2023-12-27T17:57:01Z</dc:date>
    </item>
    <item>
      <title>Re: DLT Pipeline issue - Failed to read dataset .Dataset is not defined in the pipeline.</title>
      <link>https://community.databricks.com/t5/get-started-discussions/dlt-pipeline-issue-failed-to-read-dataset-dataset-is-not-defined/m-p/55873#M2075</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;I dropped all existing objects, deleted the old DLT pipeline, and created a new one with the same name, but I see the same problem.&lt;/P&gt;&lt;P&gt;I am not sure why it complains about temporary tables, which would be created at runtime. I even tried removing the temporary flag, but the problem is the same. I am not sure what is wrong here, and I am running out of options.&lt;/P&gt;</description>
      <pubDate>Thu, 28 Dec 2023 08:29:04 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/dlt-pipeline-issue-failed-to-read-dataset-dataset-is-not-defined/m-p/55873#M2075</guid>
      <dc:creator>Dlt</dc:creator>
      <dc:date>2023-12-28T08:29:04Z</dc:date>
    </item>
    <item>
      <title>Re: DLT Pipeline issue - Failed to read dataset .Dataset is not defined in the pipeline.</title>
      <link>https://community.databricks.com/t5/get-started-discussions/dlt-pipeline-issue-failed-to-read-dataset-dataset-is-not-defined/m-p/55874#M2076</link>
      <description>&lt;P&gt;You can attach your notebook with the code here.&lt;BR /&gt;I did not see your code or the full trace.&lt;BR /&gt;&lt;BR /&gt;If I were you, I would find the exact line of code where the error occurs, remove that entire DLT table section, and check whether the pipeline works.&lt;BR /&gt;Then I would add it back, trying to resolve the errors one by one; maybe you will find a pattern.&lt;BR /&gt;But you can also attach your code (as a file), so someone can import it and help you.&lt;/P&gt;</description>
      <pubDate>Thu, 28 Dec 2023 09:06:45 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/dlt-pipeline-issue-failed-to-read-dataset-dataset-is-not-defined/m-p/55874#M2076</guid>
      <dc:creator>Wojciech_BUK</dc:creator>
      <dc:date>2023-12-28T09:06:45Z</dc:date>
    </item>
    <item>
      <title>Re: DLT Pipeline issue - Failed to read dataset .Dataset is not defined in the pipeline.</title>
      <link>https://community.databricks.com/t5/get-started-discussions/dlt-pipeline-issue-failed-to-read-dataset-dataset-is-not-defined/m-p/55878#M2078</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;If I use the code you created above, the error is as follows, for each of the 5 temp tables:&lt;/P&gt;&lt;P&gt;pyspark.errors.exceptions.AnalysisException: Failed to read dataset 'Temp_Table'. Dataset is not defined in the pipeline.&lt;/P&gt;&lt;P&gt;Below is the high-level flow of my DLT pipeline.&lt;/P&gt;&lt;P&gt;Step 1 - 5 bronze-level tables are created and loaded from JSON files.&lt;/P&gt;&lt;P&gt;Step 2 - 5 temp tables are created from the 5 bronze tables (created in Step 1), each with a derived Boolean bad-data flag.&lt;/P&gt;&lt;P&gt;Step 3 - 5 clean and 5 quarantine tables are created by separating good and bad data based on the bad-data flag.&lt;/P&gt;&lt;P&gt;Step 4 - 5 gold layer tables are created from the 5 clean tables created in Step 3.&lt;/P&gt;&lt;P&gt;Earlier I had a separate notebook for each step, which worked great. But when I combined all of these into one notebook, I ran into issues that I am not able to understand. Each table is in a separate cell in every step.&lt;/P&gt;</description>
      <pubDate>Thu, 28 Dec 2023 09:53:53 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/dlt-pipeline-issue-failed-to-read-dataset-dataset-is-not-defined/m-p/55878#M2078</guid>
      <dc:creator>Dlt</dc:creator>
      <dc:date>2023-12-28T09:53:53Z</dc:date>
    </item>
    <item>
      <title>Re: DLT Pipeline issue - Failed to read dataset .Dataset is not defined in the pipeline.</title>
      <link>https://community.databricks.com/t5/get-started-discussions/dlt-pipeline-issue-failed-to-read-dataset-dataset-is-not-defined/m-p/55882#M2079</link>
      <description>&lt;P&gt;I am sorry, but the information you are providing is not helping at all.&lt;BR /&gt;Please dump your code here.&lt;/P&gt;</description>
      <pubDate>Thu, 28 Dec 2023 10:51:45 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/dlt-pipeline-issue-failed-to-read-dataset-dataset-is-not-defined/m-p/55882#M2079</guid>
      <dc:creator>Wojciech_BUK</dc:creator>
      <dc:date>2023-12-28T10:51:45Z</dc:date>
    </item>
    <item>
      <title>Re: DLT Pipeline issue - Failed to read dataset .Dataset is not defined in the pipeline.</title>
      <link>https://community.databricks.com/t5/get-started-discussions/dlt-pipeline-issue-failed-to-read-dataset-dataset-is-not-defined/m-p/56049#M2092</link>
      <description>&lt;P&gt;The issue is fixed now. I tried using the LIVE qualifier for all the tables I used, and then it started working.&lt;/P&gt;&lt;P&gt;Thanks for all your help.&lt;/P&gt;</description>
      <pubDate>Tue, 02 Jan 2024 08:36:58 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/dlt-pipeline-issue-failed-to-read-dataset-dataset-is-not-defined/m-p/56049#M2092</guid>
      <dc:creator>Dlt</dc:creator>
      <dc:date>2024-01-02T08:36:58Z</dc:date>
    </item>
  </channel>
</rss>

