<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: [PATH_NOT_FOUND] Structured Streaming uses wrong checkpoint location in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/path-not-found-structured-streaming-uses-wrong-checkpoint/m-p/143774#M52224</link>
    <description>&lt;P&gt;Hello cgrant, and thank you for getting back to me on this matter.&lt;/P&gt;&lt;P&gt;I admit I changed the code to use the default schema and didn't preserve the original code.&lt;/P&gt;&lt;P&gt;However, reading your answer, it's likely that the error actually originated from the readStream statement rather than the writeStream statement, in which case the cause of the error would be typing a bad Volume path in the "source" variable, calling readStream, and subsequently not re-running the cell containing readStream. I'm going to accept your answer as the (very likely) solution based on this, thank you.&lt;/P&gt;</description>
    <pubDate>Mon, 12 Jan 2026 15:52:25 GMT</pubDate>
    <dc:creator>csondergaardp</dc:creator>
    <dc:date>2026-01-12T15:52:25Z</dc:date>
    <item>
      <title>[PATH_NOT_FOUND] Structured Streaming uses wrong checkpoint location</title>
      <link>https://community.databricks.com/t5/data-engineering/path-not-found-structured-streaming-uses-wrong-checkpoint/m-p/143443#M52177</link>
      <description>&lt;P&gt;I'm trying to perform a simple example using structured streaming on a directory created as a Volume. The use case is purely educational; I am investigating various forms of triggers. Basic info:&lt;/P&gt;&lt;P&gt;Catalog: "dev_catalog"&lt;BR /&gt;Schema: "stream"&lt;BR /&gt;Volume: "streaming_basics"&lt;BR /&gt;custom variable "source" with value "/Volumes/dev_catalog/stream/streaming_basics/"&lt;/P&gt;&lt;P&gt;When running a cell calling the writeStream() method, passing the path I would like to use for checkpointing, I find that Databricks inserts the default schema into the path instead of the desired "stream" schema. The cell contains the following code:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;print(source)
writestream = (df.writeStream
        .option('checkpointLocation', f'{source}AppendCheckpoint')
        .outputMode('append')
        .queryName('DefaultTrigger')
        .toTable('stream.AppendTable')
    )&lt;/LI-CODE&gt;&lt;P&gt;And the console outputs the following (value of variable 'source' and error message):&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;/Volumes/dev_catalog/stream/streaming_basics/
[PATH_NOT_FOUND] Path does not exist: /Volumes/dev_catalog/default/streaming_basics/. SQLSTATE: 42K03&lt;/LI-CODE&gt;&lt;P&gt;This is baffling to me.&lt;/P&gt;&lt;P&gt;Additional info: Earlier in the notebook, I am running the following two cells:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;%sql
DROP DATABASE IF EXISTS stream CASCADE;
CREATE DATABASE IF NOT EXISTS stream;&lt;/LI-CODE&gt;&lt;LI-CODE lang="markup"&gt;%sql
USE SCHEMA stream;&lt;/LI-CODE&gt;&lt;P&gt;The docs say that "DATABASE" is an alias for "SCHEMA", so the two can be used interchangeably. But even so, this shouldn't be an issue since the schema "stream" is specified explicitly in my 'source' variable. From the documentation on Volumes, any directory or file in a Volume is specified using:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;/Volumes/&amp;lt;catalog_identifier&amp;gt;/&amp;lt;schema_identifier&amp;gt;/&amp;lt;volume_identifier&amp;gt;/&amp;lt;path&amp;gt;/&amp;lt;file_name&amp;gt;&lt;/LI-CODE&gt;&lt;P&gt;This aligns with the path I've specified in the "source" variable.&lt;/P&gt;&lt;P&gt;More additional info: I've previously run another structured streaming educational example in a different notebook&amp;nbsp;&lt;EM&gt;on the same cluster&lt;/EM&gt; in the "dev_catalog.default" schema, with a volume of the same name. The checkpointLocation which the writeStream method apparently uses aligns with the location I specified in this previous example. In between sessions, I've stopped the cluster and started it again. I include this information because the only explanation I can come up with myself based on the error message is that the cluster cached information from the previous streaming job (in the default schema) and is now using the cached information - either the variable "source" or the call to "writeStream()" itself.&lt;/P&gt;&lt;P&gt;Any explanation for this behaviour would be much appreciated, as well as tips on how to overcome it.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 09 Jan 2026 09:25:16 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/path-not-found-structured-streaming-uses-wrong-checkpoint/m-p/143443#M52177</guid>
      <dc:creator>csondergaardp</dc:creator>
      <dc:date>2026-01-09T09:25:16Z</dc:date>
    </item>
    <item>
      <title>Re: [PATH_NOT_FOUND] Structured Streaming uses wrong checkpoint location</title>
      <link>https://community.databricks.com/t5/data-engineering/path-not-found-structured-streaming-uses-wrong-checkpoint/m-p/143650#M52213</link>
      <description>&lt;P&gt;Your checkpoint code looks correct.&lt;/P&gt;
&lt;P&gt;What is the source of `df`? Is it `/Volumes/dev_catalog/default/streaming_basics/`? That path looks incorrect: it has `default` where `stream` should be.&lt;/P&gt;
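&lt;P&gt;One way to check which schema a Volume path actually points to is sketched below. This is plain Python with no Spark calls, and it assumes the documented `/Volumes/catalog/schema/volume/...` layout quoted in the question. The helper names `volume_parts` and `check_schema` are made up for illustration and are not part of any Databricks API.&lt;/P&gt;

```python
from pathlib import PurePosixPath


def volume_parts(path):
    """Split a /Volumes/catalog/schema/volume/... path into its three names."""
    parts = PurePosixPath(path).parts  # e.g. ('/', 'Volumes', 'dev_catalog', ...)
    if parts[:2] != ("/", "Volumes") or len(parts[2:5]) != 3:
        raise ValueError(f"Not a Volume path: {path!r}")
    catalog, schema, volume = parts[2:5]
    return {"catalog": catalog, "schema": schema, "volume": volume}


def check_schema(path, expected_schema):
    """Return True when the path points into the expected schema."""
    return volume_parts(path)["schema"] == expected_schema


# Path held in the 'source' variable vs. the path reported in the error message.
print(check_schema("/Volumes/dev_catalog/stream/streaming_basics/", "stream"))
print(check_schema("/Volumes/dev_catalog/default/streaming_basics/", "stream"))
```

&lt;P&gt;Running a check like this on the path used by the read side, and again on the checkpoint path, may help show whether a cell that was not re-run is still holding an older path.&lt;/P&gt;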
</description>
      <pubDate>Sun, 11 Jan 2026 23:03:12 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/path-not-found-structured-streaming-uses-wrong-checkpoint/m-p/143650#M52213</guid>
      <dc:creator>cgrant</dc:creator>
      <dc:date>2026-01-11T23:03:12Z</dc:date>
    </item>
    <item>
      <title>Re: [PATH_NOT_FOUND] Structured Streaming uses wrong checkpoint location</title>
      <link>https://community.databricks.com/t5/data-engineering/path-not-found-structured-streaming-uses-wrong-checkpoint/m-p/143774#M52224</link>
      <description>&lt;P&gt;Hello cgrant, and thank you for getting back to me on this matter.&lt;/P&gt;&lt;P&gt;I admit I changed the code to use the default schema and didn't preserve the original code.&lt;/P&gt;&lt;P&gt;However, reading your answer, it's likely that the error actually originated from the readStream statement rather than the writeStream statement, in which case the cause of the error would be typing a bad Volume path in the "source" variable, calling readStream, and subsequently not re-running the cell containing readStream. I'm going to accept your answer as the (very likely) solution based on this, thank you.&lt;/P&gt;</description>
      <pubDate>Mon, 12 Jan 2026 15:52:25 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/path-not-found-structured-streaming-uses-wrong-checkpoint/m-p/143774#M52224</guid>
      <dc:creator>csondergaardp</dc:creator>
      <dc:date>2026-01-12T15:52:25Z</dc:date>
    </item>
  </channel>
</rss>

