<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Frequent “GetPathStatus” and “GetBlobProperties” PathNotFound Errors on Azure Storage in Databri in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/frequent-getpathstatus-and-getblobproperties-pathnotfound-errors/m-p/96388#M39271</link>
    <description>&lt;P class=""&gt;Hi,&lt;/P&gt;&lt;P class=""&gt;I confirm that I have checked, and everything seems to be in order: both the path and the permissions are in place, as we are also successfully writing data to the container. I noticed that these messages come up in several situations:&lt;/P&gt;&lt;P class=""&gt;1. In SQL Warehouse queries&lt;/P&gt;&lt;P class=""&gt;&lt;SPAN class=""&gt;2. During &lt;/SPAN&gt;spark.read...&lt;SPAN class=""&gt; operations&lt;/SPAN&gt;&lt;/P&gt;&lt;P class=""&gt;&lt;SPAN class=""&gt;3. During &lt;/SPAN&gt;spark.write...&lt;SPAN class=""&gt; operations&lt;/SPAN&gt;&lt;/P&gt;&lt;P class=""&gt;We are using DBR 13.3 in this workspace. Any ideas on why so many storage-related messages are appearing? They only started appearing after we enabled diagnostic settings in Azure.&lt;/P&gt;</description>
    <pubDate>Mon, 28 Oct 2024 07:28:26 GMT</pubDate>
    <dc:creator>h_h_ak</dc:creator>
    <dc:date>2024-10-28T07:28:26Z</dc:date>
    <item>
      <title>Frequent “GetPathStatus” and “GetBlobProperties” PathNotFound Errors on Azure Storage in Databricks</title>
      <link>https://community.databricks.com/t5/data-engineering/frequent-getpathstatus-and-getblobproperties-pathnotfound-errors/m-p/96151#M39227</link>
      <description>&lt;P class=""&gt;We are encountering frequent &lt;SPAN class=""&gt;GetPathStatus&lt;/SPAN&gt; and &lt;SPAN class=""&gt;GetBlobProperties&lt;/SPAN&gt; errors when trying to access Azure Data Lake Storage (ADLS) paths through our Databricks environment. The errors consistently return a &lt;SPAN class=""&gt;404 PathNotFound&lt;/SPAN&gt; status for paths that should be accessible.&lt;/P&gt;&lt;P class=""&gt;&lt;STRONG&gt;Context:&lt;/STRONG&gt;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;STRONG&gt;Operation&lt;/STRONG&gt;: &lt;SPAN class=""&gt;df.write()&lt;/SPAN&gt; and &lt;SPAN class=""&gt;df.read()&lt;/SPAN&gt; operations on Databricks, attempting to access Azure storage paths.&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Storage Path&lt;/STRONG&gt;: &lt;SPAN class=""&gt;/stxxxx/src-sapecc/&lt;/SPAN&gt; and other related paths in Azure Data Lake Gen2.&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Errors Observed&lt;/STRONG&gt;:&lt;/LI&gt;&lt;LI&gt;GetPathStatus: PathNotFound&lt;/LI&gt;&lt;LI&gt;GetBlobProperties: BlobNotFound&lt;/LI&gt;&lt;/UL&gt;&lt;P class=""&gt;&lt;STRONG&gt;Error Count&lt;/STRONG&gt;: The errors are recurring frequently, as seen in the attached logs, which indicate multiple instances of the &lt;SPAN class=""&gt;PathNotFound&lt;/SPAN&gt; error with status code 404.&lt;/P&gt;&lt;P class=""&gt;&lt;STRONG&gt;Timestamps&lt;/STRONG&gt;: Errors occur across multiple timestamps (see attached logs for details).&lt;/P&gt;&lt;P class=""&gt;&lt;STRONG&gt;Attached Screenshot&lt;/STRONG&gt;: Logs showing details of the error, including the operation name, status codes, and paths.&lt;/P&gt;&lt;P class=""&gt;Could you please assist in identifying why these &lt;SPAN class=""&gt;PathNotFound&lt;/SPAN&gt; and &lt;SPAN class=""&gt;BlobNotFound&lt;/SPAN&gt; errors are occurring despite correct configuration and permissions? 
Additionally, if any further configuration is required on the Azure or Databricks side to resolve this, please advise. Thanks in advance.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="h_h_ak_0-1729867508203.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/12305i592FFDE7533C17AA/image-size/medium?v=v2&amp;amp;px=400" role="button" title="h_h_ak_0-1729867508203.png" alt="h_h_ak_0-1729867508203.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 25 Oct 2024 14:50:20 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/frequent-getpathstatus-and-getblobproperties-pathnotfound-errors/m-p/96151#M39227</guid>
      <dc:creator>h_h_ak</dc:creator>
      <dc:date>2024-10-25T14:50:20Z</dc:date>
    </item>
    <item>
      <title>Re: Frequent “GetPathStatus” and “GetBlobProperties” PathNotFound Errors on Azure Storage in Databri</title>
      <link>https://community.databricks.com/t5/data-engineering/frequent-getpathstatus-and-getblobproperties-pathnotfound-errors/m-p/96164#M39233</link>
      <description>&lt;P&gt;&lt;SPAN&gt;Hi,&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;1) Ensure that the paths you are trying to access are correct and exist in the ADLS Gen2 storage account.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;2) Verify that the Databricks cluster has the necessary permissions to access the ADLS Gen2 paths.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Best regards&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 25 Oct 2024 15:38:07 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/frequent-getpathstatus-and-getblobproperties-pathnotfound-errors/m-p/96164#M39233</guid>
      <dc:creator>saurabh18cs</dc:creator>
      <dc:date>2024-10-25T15:38:07Z</dc:date>
    </item>
    <item>
      <title>Re: Frequent “GetPathStatus” and “GetBlobProperties” PathNotFound Errors on Azure Storage in Databri</title>
      <link>https://community.databricks.com/t5/data-engineering/frequent-getpathstatus-and-getblobproperties-pathnotfound-errors/m-p/96388#M39271</link>
      <description>&lt;P class=""&gt;Hi,&lt;/P&gt;&lt;P class=""&gt;I confirm that I have checked, and everything seems to be in order: both the path and the permissions are in place, as we are also successfully writing data to the container. I noticed that these messages come up in several situations:&lt;/P&gt;&lt;P class=""&gt;1. In SQL Warehouse queries&lt;/P&gt;&lt;P class=""&gt;&lt;SPAN class=""&gt;2. During &lt;/SPAN&gt;spark.read...&lt;SPAN class=""&gt; operations&lt;/SPAN&gt;&lt;/P&gt;&lt;P class=""&gt;&lt;SPAN class=""&gt;3. During &lt;/SPAN&gt;spark.write...&lt;SPAN class=""&gt; operations&lt;/SPAN&gt;&lt;/P&gt;&lt;P class=""&gt;We are using DBR 13.3 in this workspace. Any ideas on why so many storage-related messages are appearing? They only started appearing after we enabled diagnostic settings in Azure.&lt;/P&gt;</description>
      <pubDate>Mon, 28 Oct 2024 07:28:26 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/frequent-getpathstatus-and-getblobproperties-pathnotfound-errors/m-p/96388#M39271</guid>
      <dc:creator>h_h_ak</dc:creator>
      <dc:date>2024-10-28T07:28:26Z</dc:date>
    </item>
    <item>
      <title>Re: Frequent “GetPathStatus” and “GetBlobProperties” PathNotFound Errors on Azure Storage in Databri</title>
      <link>https://community.databricks.com/t5/data-engineering/frequent-getpathstatus-and-getblobproperties-pathnotfound-errors/m-p/96392#M39272</link>
      <description>&lt;P&gt;Testing this in incognito mode will help!&lt;/P&gt;</description>
      <pubDate>Mon, 28 Oct 2024 07:46:15 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/frequent-getpathstatus-and-getblobproperties-pathnotfound-errors/m-p/96392#M39272</guid>
      <dc:creator>aashish122</dc:creator>
      <dc:date>2024-10-28T07:46:15Z</dc:date>
    </item>
    <item>
      <title>Re: Frequent “GetPathStatus” and “GetBlobProperties” PathNotFound Errors on Azure Storage in Databri</title>
      <link>https://community.databricks.com/t5/data-engineering/frequent-getpathstatus-and-getblobproperties-pathnotfound-errors/m-p/96398#M39275</link>
      <description>&lt;P class=""&gt;Why do you think this will help? We have Spark and connection configuration in the cluster settings, and &lt;SPAN class=""&gt;spark.read...&lt;/SPAN&gt; or &lt;SPAN class=""&gt;write&lt;/SPAN&gt; statements are executed by the notebook. Additionally, the SQL queries are coming from outside. How would Incognito help in this scenario?&lt;/P&gt;</description>
      <pubDate>Mon, 28 Oct 2024 07:57:39 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/frequent-getpathstatus-and-getblobproperties-pathnotfound-errors/m-p/96398#M39275</guid>
      <dc:creator>h_h_ak</dc:creator>
      <dc:date>2024-10-28T07:57:39Z</dc:date>
    </item>
    <item>
      <title>Re: Frequent “GetPathStatus” and “GetBlobProperties” PathNotFound Errors on Azure Storage in Databri</title>
      <link>https://community.databricks.com/t5/data-engineering/frequent-getpathstatus-and-getblobproperties-pathnotfound-errors/m-p/102205#M41018</link>
      <description>&lt;P&gt;&lt;EM&gt;&lt;STRONG&gt;Adding the answer from the MSFT Support Team:&lt;/STRONG&gt;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Why is _delta_log being checked when the function used is parquet?&lt;/STRONG&gt;&lt;BR /&gt;&lt;SPAN&gt;The system scans the target directory and its parent directories for a _delta_log folder. If a user writes to a Delta table with the wrong format (e.g., Parquet instead of Delta), the system can detect the mistake and fail the job to prevent data corruption.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;Why do all parent folders get _delta_log calls?&lt;/STRONG&gt;&lt;BR /&gt;&lt;SPAN&gt;The system recursively checks all parent directories for a _delta_log folder to determine whether any of them is a Delta table. This is part of the design to ensure that the correct table format is used and to avoid data integrity issues.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;Why are _encryption_metadata/manifest.json and _spark_metadata being requested when they are not present in the folders?&lt;/STRONG&gt;&lt;BR /&gt;&lt;SPAN&gt;The _encryption_metadata/manifest.json file is checked to determine whether encryption is enabled on the storage.&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;The _spark_metadata directory is typically created by streaming jobs to store metadata about the stream. Even though these files may not be present in the folders, the system checks for them as part of its standard operations.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;How to remove these requests?&lt;/STRONG&gt;&lt;BR /&gt;&lt;SPAN&gt;Currently, there is no direct way to remove these requests, as they are part of the system's design to ensure data integrity and correct table format usage.&lt;/SPAN&gt;&lt;/P&gt;</description>
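      <!--
      Editor's sketch, not part of the original post. The support answer above says the
      target directory and every parent directory are checked for a _delta_log folder.
      The snippet below only illustrates which paths such a walk would touch; it is not
      the Databricks or Delta implementation. The function name delta_log_probe_paths and
      the trailing "orders" folder are hypothetical, and the masked "stxxxx" segment is
      kept as it appears in the thread.

```python
from posixpath import dirname, join


def delta_log_probe_paths(target):
    """List the _delta_log locations for target and each of its parent directories."""
    # Normalise a trailing slash, but keep the root itself as "/".
    current = target.rstrip("/") or "/"
    probes = []
    while True:
        probes.append(join(current, "_delta_log"))
        parent = dirname(current)
        if parent == current:  # reached the top of the path; stop walking up
            break
        current = parent
    return probes


# Hypothetical example path (the "orders" folder is an assumption).
for probe in delta_log_probe_paths("/stxxxx/src-sapecc/orders"):
    print(probe)
```

      Under this reading, a single read or write against a plain Parquet folder can produce
      one lookup per directory level. Where no _delta_log folder exists at a given level,
      the lookup would plausibly show up in the storage diagnostic logs as a 404
      PathNotFound or BlobNotFound entry, even though the job itself succeeds.
      -->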
      <pubDate>Mon, 16 Dec 2024 08:22:31 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/frequent-getpathstatus-and-getblobproperties-pathnotfound-errors/m-p/102205#M41018</guid>
      <dc:creator>h_h_ak</dc:creator>
      <dc:date>2024-12-16T08:22:31Z</dc:date>
    </item>
  </channel>
</rss>

