<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic dlt Streaming Checkpoint Not Found in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/dlt-streaming-checkpoint-not-found/m-p/87071#M37370</link>
    <description>&lt;P&gt;I am using Delta Live Tables and have my pipeline defined using the code below. My understanding is that a checkpoint is automatically set when using Delta Live Tables. I am using the Unity Catalog and Schema settings in the pipeline as the storage destination.&lt;BR /&gt;&lt;BR /&gt;Since I am reading JSON messages and many files are being created, I want to eventually run a cleanup process to delete the old files that have already been written to the streaming table. I thought I could do this by looking at the checkpoint file, but I am unable to find where the checkpoints are being written or how I can access them. When I try to manually set a checkpoint directory, nothing gets created when the pipeline runs.&lt;/P&gt;&lt;LI-CODE lang="python"&gt;@dlt.table(
    name="newdata_raw",
    table_properties={"quality": "bronze"},
    temporary=False,
)
def create_table():
    query = (
        spark.readStream.format("cloudFiles")
        .schema(schema)
        .option("cloudFiles.format", "json")
        .load(sink_dir + "partition=*/")
        .selectExpr("newRecord.*")
        .withColumn("LOAD_DT", to_timestamp(current_timestamp()))
    )
    return query&lt;/LI-CODE&gt;</description>
    <pubDate>Sat, 31 Aug 2024 19:02:15 GMT</pubDate>
    <dc:creator>ggsmith</dc:creator>
    <dc:date>2024-08-31T19:02:15Z</dc:date>
    <item>
      <title>dlt Streaming Checkpoint Not Found</title>
      <link>https://community.databricks.com/t5/data-engineering/dlt-streaming-checkpoint-not-found/m-p/87071#M37370</link>
      <description>&lt;P&gt;I am using Delta Live Tables and have my pipeline defined using the code below. My understanding is that a checkpoint is automatically set when using Delta Live Tables. I am using the Unity Catalog and Schema settings in the pipeline as the storage destination.&lt;BR /&gt;&lt;BR /&gt;Since I am reading JSON messages and many files are being created, I want to eventually run a cleanup process to delete the old files that have already been written to the streaming table. I thought I could do this by looking at the checkpoint file, but I am unable to find where the checkpoints are being written or how I can access them. When I try to manually set a checkpoint directory, nothing gets created when the pipeline runs.&lt;/P&gt;&lt;LI-CODE lang="python"&gt;@dlt.table(
    name="newdata_raw",
    table_properties={"quality": "bronze"},
    temporary=False,
)
def create_table():
    query = (
        spark.readStream.format("cloudFiles")
        .schema(schema)
        .option("cloudFiles.format", "json")
        .load(sink_dir + "partition=*/")
        .selectExpr("newRecord.*")
        .withColumn("LOAD_DT", to_timestamp(current_timestamp()))
    )
    return query&lt;/LI-CODE&gt;</description>
      <pubDate>Sat, 31 Aug 2024 19:02:15 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dlt-streaming-checkpoint-not-found/m-p/87071#M37370</guid>
      <dc:creator>ggsmith</dc:creator>
      <dc:date>2024-08-31T19:02:15Z</dc:date>
    </item>
    <item>
      <title>Re: dlt Streaming Checkpoint Not Found</title>
      <link>https://community.databricks.com/t5/data-engineering/dlt-streaming-checkpoint-not-found/m-p/87115#M37380</link>
      <description>&lt;P&gt;Hi &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/115999"&gt;@ggsmith&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;If you use Delta Live Tables, checkpoints are stored under the storage location specified in the DLT settings. Each table gets a dedicated directory under storage_location/checkpoints/&amp;lt;dlt_table_name&amp;gt;.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Slash_0-1725218574958.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/10842iD458FAD67C6D5477/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Slash_0-1725218574958.png" alt="Slash_0-1725218574958.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 01 Sep 2024 19:26:02 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dlt-streaming-checkpoint-not-found/m-p/87115#M37380</guid>
      <dc:creator>szymon_dybczak</dc:creator>
      <dc:date>2024-09-01T19:26:02Z</dc:date>
    </item>
    <item>
      <title>Re: dlt Streaming Checkpoint Not Found</title>
      <link>https://community.databricks.com/t5/data-engineering/dlt-streaming-checkpoint-not-found/m-p/92735#M38524</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/110502"&gt;@szymon_dybczak&lt;/a&gt;, how can I access the checkpoint? Is there any way I can delete the checkpoints stored in the storage location? The reason I want to clean up the checkpoints is that a change to spark.sql.shuffle.partitions is not taking effect. According to some discussions in the community, a change to this parameter only takes effect after the existing checkpoints are cleaned up, since its value is saved there.&lt;/P&gt;</description>
      <pubDate>Fri, 04 Oct 2024 07:32:19 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dlt-streaming-checkpoint-not-found/m-p/92735#M38524</guid>
      <dc:creator>PushkarDeole</dc:creator>
      <dc:date>2024-10-04T07:32:19Z</dc:date>
    </item>
    <item>
      <title>Re: dlt Streaming Checkpoint Not Found</title>
      <link>https://community.databricks.com/t5/data-engineering/dlt-streaming-checkpoint-not-found/m-p/92738#M38526</link>
      <description>&lt;P&gt;Hi &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/112308"&gt;@PushkarDeole&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;You can just go to that location and delete it manually, or you can use dbutils, whichever you prefer.&lt;/P&gt;</description>
      <pubDate>Fri, 04 Oct 2024 07:57:04 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dlt-streaming-checkpoint-not-found/m-p/92738#M38526</guid>
      <dc:creator>szymon_dybczak</dc:creator>
      <dc:date>2024-10-04T07:57:04Z</dc:date>
    </item>
    <item>
      <title>Re: dlt Streaming Checkpoint Not Found</title>
      <link>https://community.databricks.com/t5/data-engineering/dlt-streaming-checkpoint-not-found/m-p/92756#M38534</link>
      <description>&lt;P&gt;Thanks for the quick response, &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/110502"&gt;@szymon_dybczak&lt;/a&gt;, I appreciate it. I am probably missing something. I will check the dbutils part to access the location.&lt;/P&gt;&lt;P&gt;However, on your first point, I am not sure how I can go directly to the location and delete it manually. My main question is: how can I access that location directly without using any utility?&lt;/P&gt;</description>
      <pubDate>Fri, 04 Oct 2024 11:49:33 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dlt-streaming-checkpoint-not-found/m-p/92756#M38534</guid>
      <dc:creator>PushkarDeole</dc:creator>
      <dc:date>2024-10-04T11:49:33Z</dc:date>
    </item>
    <item>
      <title>Re: dlt Streaming Checkpoint Not Found</title>
      <link>https://community.databricks.com/t5/data-engineering/dlt-streaming-checkpoint-not-found/m-p/92779#M38540</link>
      <description>&lt;P&gt;We are using Unity Catalog, so I don't see this storage location option, just the catalog &amp;amp; target schema.&lt;/P&gt;</description>
      <pubDate>Fri, 04 Oct 2024 16:10:45 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dlt-streaming-checkpoint-not-found/m-p/92779#M38540</guid>
      <dc:creator>ggsmith</dc:creator>
      <dc:date>2024-10-04T16:10:45Z</dc:date>
    </item>
    <item>
      <title>Re: dlt Streaming Checkpoint Not Found</title>
      <link>https://community.databricks.com/t5/data-engineering/dlt-streaming-checkpoint-not-found/m-p/117517#M45514</link>
      <description>&lt;P&gt;I have the same issue as ggsmith, &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/110502"&gt;@szymon_dybczak&lt;/a&gt;. We use UC as the destination, so I do not see a storage path:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="BF7_0-1746192560896.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/16427iDC6377E1251EDDB4/image-size/medium?v=v2&amp;amp;px=400" role="button" title="BF7_0-1746192560896.png" alt="BF7_0-1746192560896.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;What am I missing?&lt;/P&gt;</description>
      <pubDate>Fri, 02 May 2025 13:31:43 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dlt-streaming-checkpoint-not-found/m-p/117517#M45514</guid>
      <dc:creator>BF7</dc:creator>
      <dc:date>2025-05-02T13:31:43Z</dc:date>
    </item>
    <item>
      <title>Re: dlt Streaming Checkpoint Not Found</title>
      <link>https://community.databricks.com/t5/data-engineering/dlt-streaming-checkpoint-not-found/m-p/119578#M45920</link>
      <description>&lt;P&gt;same here&lt;/P&gt;</description>
      <pubDate>Mon, 19 May 2025 06:41:25 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dlt-streaming-checkpoint-not-found/m-p/119578#M45920</guid>
      <dc:creator>a_user12</dc:creator>
      <dc:date>2025-05-19T06:41:25Z</dc:date>
    </item>
  </channel>
</rss>

