<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Getting error hadoop_azure_shaded.com.microsoft.azure.storage.StorageException: The specified bl in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/getting-error-hadoop-azure-shaded-com-microsoft-azure-storage/m-p/161071#M54986</link>
    <description>&lt;P&gt;It's generally due to &lt;STRONG&gt;race conditions&lt;/STRONG&gt; when Spark checks for existing partition files before writing, combined with the way wasbs:// emulates directories on Azure Blob Storage, where directory renames are not atomic by default. Concurrent or retried tasks can then reference a blob that has already been moved or deleted.&lt;/P&gt;&lt;H2&gt;&lt;FONT size="3"&gt;You can try the following:&lt;/FONT&gt;&lt;/H2&gt;&lt;P&gt;&lt;FONT size="3"&gt;1. &lt;STRONG&gt;Switch to Delta Lake&lt;/STRONG&gt; - Use the Delta Lake format instead of Parquet with append mode. Delta handles concurrency and append operations reliably.&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT size="3"&gt;2. &lt;STRONG&gt;Use the ABFS/ABFSS protocol&lt;/STRONG&gt; with Azure Data Lake Storage - Switch from wasbs:// to abfss://, as it has better consistency guarantees. It requires your storage account to have hierarchical namespace enabled (ADLS Gen2), so enable it for best results. Use &lt;STRONG&gt;Unity Catalog volumes&lt;/STRONG&gt; if feasible.&lt;/FONT&gt;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;spark.read.load("abfss://container@storageaccount.dfs.core.windows.net/data_path")&lt;/LI-CODE&gt;&lt;P&gt;&lt;FONT size="3"&gt;More details &lt;A href="https://docs.databricks.com/aws/en/connect/storage/azure-storage" target="_self"&gt;here&lt;/A&gt;.&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT size="3"&gt;3. &lt;STRONG&gt;Use overwrite mode with partition overwrite&lt;/STRONG&gt; (for example, dynamic partition overwrite) if append semantics aren't strictly required per partition.&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT size="3"&gt;4. &lt;STRONG&gt;Add retry logic&lt;/STRONG&gt; - Wrap the write operation in retry logic to handle transient Azure Storage errors.&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT size="3"&gt;5. &lt;STRONG&gt;Check the storage configuration&lt;/STRONG&gt; - Ensure you are using a recent Hadoop Azure connector version and review the storage account settings.&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT size="3"&gt;Overall, Delta Lake is the recommended option: it is ACID-compliant, handles concurrent writes safely, and is well suited to production workloads on Databricks.&lt;/FONT&gt;&lt;/P&gt;</description>
    <pubDate>Wed, 01 Jul 2026 11:53:05 GMT</pubDate>
    <dc:creator>balajij8</dc:creator>
    <dc:date>2026-07-01T11:53:05Z</dc:date>
    <item>
      <title>Getting error hadoop_azure_shaded.com.microsoft.azure.storage.StorageException: The specified blob d</title>
      <link>https://community.databricks.com/t5/data-engineering/getting-error-hadoop-azure-shaded-com-microsoft-azure-storage/m-p/161069#M54984</link>
      <description>&lt;P&gt;I am exporting Parquet files (partitioned by id) in append mode. The job sometimes completes successfully, but occasionally fails with the following error:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;Apache Spark Exception: Exception thrown in awaitResult: hadoop_azure_shaded.com.microsoft.azure.storage.StorageException: The specified blob does not exist.&lt;/LI-CODE&gt;&lt;P&gt;Currently, storage access is configured as follows: `wasbs://&amp;lt;container-name&amp;gt;@&amp;lt;storage-account-name&amp;gt;.blob.core.windows.net/&amp;lt;directory-name&amp;gt;`&lt;/P&gt;&lt;P&gt;Can anyone help with exporting in append mode?&lt;/P&gt;</description>
      <pubDate>Wed, 01 Jul 2026 11:27:15 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/getting-error-hadoop-azure-shaded-com-microsoft-azure-storage/m-p/161069#M54984</guid>
      <dc:creator>hanifmusa</dc:creator>
      <dc:date>2026-07-01T11:27:15Z</dc:date>
    </item>
    <item>
      <title>Re: Getting error hadoop_azure_shaded.com.microsoft.azure.storage.StorageException: The specified bl</title>
      <link>https://community.databricks.com/t5/data-engineering/getting-error-hadoop-azure-shaded-com-microsoft-azure-storage/m-p/161071#M54986</link>
      <description>&lt;P&gt;It's generally due to &lt;STRONG&gt;race conditions&lt;/STRONG&gt; when Spark checks for existing partition files before writing, combined with the way wasbs:// emulates directories on Azure Blob Storage, where directory renames are not atomic by default. Concurrent or retried tasks can then reference a blob that has already been moved or deleted.&lt;/P&gt;&lt;H2&gt;&lt;FONT size="3"&gt;You can try the following:&lt;/FONT&gt;&lt;/H2&gt;&lt;P&gt;&lt;FONT size="3"&gt;1. &lt;STRONG&gt;Switch to Delta Lake&lt;/STRONG&gt; - Use the Delta Lake format instead of Parquet with append mode. Delta handles concurrency and append operations reliably.&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT size="3"&gt;2. &lt;STRONG&gt;Use the ABFS/ABFSS protocol&lt;/STRONG&gt; with Azure Data Lake Storage - Switch from wasbs:// to abfss://, as it has better consistency guarantees. It requires your storage account to have hierarchical namespace enabled (ADLS Gen2), so enable it for best results. Use &lt;STRONG&gt;Unity Catalog volumes&lt;/STRONG&gt; if feasible.&lt;/FONT&gt;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;spark.read.load("abfss://container@storageaccount.dfs.core.windows.net/data_path")&lt;/LI-CODE&gt;&lt;P&gt;&lt;FONT size="3"&gt;More details &lt;A href="https://docs.databricks.com/aws/en/connect/storage/azure-storage" target="_self"&gt;here&lt;/A&gt;.&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT size="3"&gt;3. &lt;STRONG&gt;Use overwrite mode with partition overwrite&lt;/STRONG&gt; (for example, dynamic partition overwrite) if append semantics aren't strictly required per partition.&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT size="3"&gt;4. &lt;STRONG&gt;Add retry logic&lt;/STRONG&gt; - Wrap the write operation in retry logic to handle transient Azure Storage errors.&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT size="3"&gt;5. &lt;STRONG&gt;Check the storage configuration&lt;/STRONG&gt; - Ensure you are using a recent Hadoop Azure connector version and review the storage account settings.&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT size="3"&gt;Overall, Delta Lake is the recommended option: it is ACID-compliant, handles concurrent writes safely, and is well suited to production workloads on Databricks.&lt;/FONT&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 01 Jul 2026 11:53:05 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/getting-error-hadoop-azure-shaded-com-microsoft-azure-storage/m-p/161071#M54986</guid>
      <dc:creator>balajij8</dc:creator>
      <dc:date>2026-07-01T11:53:05Z</dc:date>
    </item>
  </channel>
</rss>

