<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Unity Catalog Volume mounting broken by cluster environment variables (http proxy) in Administration &amp; Architecture</title>
    <link>https://community.databricks.com/t5/administration-architecture/unity-catalog-volume-mounting-broken-by-cluster-environment/m-p/134247#M4170</link>
    <description>&lt;P&gt;Unfortunately, the only solution I found was to not use the proxy globally. Good luck!&lt;/P&gt;</description>
    <pubDate>Wed, 08 Oct 2025 16:07:21 GMT</pubDate>
    <dc:creator>Seb_G</dc:creator>
    <dc:date>2025-10-08T16:07:21Z</dc:date>
    <item>
      <title>Unity Catalog Volume mounting broken by cluster environment variables (http proxy)</title>
      <link>https://community.databricks.com/t5/administration-architecture/unity-catalog-volume-mounting-broken-by-cluster-environment/m-p/92583#M1965</link>
      <description>&lt;P&gt;Hello all,&lt;BR /&gt;I have a slightly niche issue here, albeit one that others are likely to run into.&lt;/P&gt;&lt;P&gt;Using Databricks on Azure, my organisation has extended our WAN into the cloud, so that all compute clusters are granted a private IP address that can reach on-prem servers (using VNet injection). One of those servers is an HTTP/HTTPS proxy, through which all our traffic to non-Azure systems should be routed. This is achieved through SCC and private VNets.&lt;/P&gt;&lt;P&gt;Recently, to permit the installation of libraries from PyPI, I added the following to the clusters' environment variables: &lt;SPAN class=""&gt;http_proxy, HTTP_PROXY, https_proxy, HTTPS_PROXY. That allowed installation from PyPI on boot when adding the library to the cluster libraries (I am aware that pip --proxy can work in a notebook).&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class=""&gt;However, I've since discovered that this breaks Python's access to volumes in Unity Catalog. My volume is an Azure storage account container. The proxy server is a whitelisted IP in the Azure storage account (it's a public IPv4 address). I am account admin, workspace admin, and have manually granted myself all privileges at the catalog level. The compute is single-user, running DBR 15.4 LTS, with the following config:&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="java"&gt;spark.databricks.cluster.profile singleNode
spark.master local[*]&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN class=""&gt;The code below works as expected regardless of whether the proxy environment variables are set:&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;path = "/Volumes/&amp;lt;catalog&amp;gt;/&amp;lt;schema&amp;gt;/&amp;lt;volume&amp;gt;/&amp;lt;folder1&amp;gt;/&amp;lt;folder2&amp;gt;/&amp;lt;folder3&amp;gt;/&amp;lt;file&amp;gt;.json"
dbutils.fs.head(path)&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN class=""&gt;The following code reads the file as expected when the proxy variables are not set,&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;import json

with open(path, "r") as fp:
    json.load(fp)&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN class=""&gt;but raises the following error when the proxy variables are set:&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;PermissionError: [Errno 13] Permission denied: '/Volumes/&amp;lt;catalog&amp;gt;/&amp;lt;schema&amp;gt;/&amp;lt;volume&amp;gt;/&amp;lt;folder1&amp;gt;/&amp;lt;folder2&amp;gt;/&amp;lt;folder3&amp;gt;/&amp;lt;file&amp;gt;.json'&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I would really like to keep the proxy environment variables so that all traffic to the public internet (e.g. library installs) is routed through the proxy seamlessly. What I'm hoping can be answered:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Why does volume access via dbutils or by browsing the Unity Catalog GUI work on the compute when the environment variables are set, while vanilla Python file access does not?&lt;/LI&gt;&lt;LI&gt;Are there any suggested workarounds to allow Python to interact with the filesystem when the environment variables are set?&lt;/LI&gt;&lt;LI&gt;If I cannot set the HTTP/HTTPS proxy environment variables, are there any other variables or Spark config that could get the cluster to access PyPI, Maven, etc. via a proxy by default?&lt;/LI&gt;&lt;/UL&gt;</description>
      <pubDate>Wed, 02 Oct 2024 14:03:03 GMT</pubDate>
      <guid>https://community.databricks.com/t5/administration-architecture/unity-catalog-volume-mounting-broken-by-cluster-environment/m-p/92583#M1965</guid>
      <dc:creator>Seb_G</dc:creator>
      <dc:date>2024-10-02T14:03:03Z</dc:date>
    </item>
    <item>
      <title>Re: Unity Catalog Volume mounting broken by cluster environment variables (http proxy)</title>
      <link>https://community.databricks.com/t5/administration-architecture/unity-catalog-volume-mounting-broken-by-cluster-environment/m-p/134242#M4168</link>
      <description>&lt;P&gt;Bumping this as I am having the same issue.&lt;BR /&gt;&lt;BR /&gt;Is the solution simply to not define the proxy variables globally?&lt;BR /&gt;&lt;BR /&gt;Is there something to add to NO_PROXY or the spark_conf so that communication within Databricks between storage accounts does not go through the proxy?&lt;BR /&gt;&lt;BR /&gt;I have already tried adding the storage accounts in use and the Databricks workspace to NO_PROXY, as well as adding the Java options for the driver and executor.&lt;/P&gt;</description>
      <pubDate>Wed, 08 Oct 2025 15:54:43 GMT</pubDate>
      <guid>https://community.databricks.com/t5/administration-architecture/unity-catalog-volume-mounting-broken-by-cluster-environment/m-p/134242#M4168</guid>
      <dc:creator>amartt</dc:creator>
      <dc:date>2025-10-08T15:54:43Z</dc:date>
    </item>
    <item>
      <title>Re: Unity Catalog Volume mounting broken by cluster environment variables (http proxy)</title>
      <link>https://community.databricks.com/t5/administration-architecture/unity-catalog-volume-mounting-broken-by-cluster-environment/m-p/134247#M4170</link>
      <description>&lt;P&gt;Unfortunately, the only solution I found was to not use the proxy globally. Good luck!&lt;/P&gt;</description>
      <pubDate>Wed, 08 Oct 2025 16:07:21 GMT</pubDate>
      <guid>https://community.databricks.com/t5/administration-architecture/unity-catalog-volume-mounting-broken-by-cluster-environment/m-p/134247#M4170</guid>
      <dc:creator>Seb_G</dc:creator>
      <dc:date>2025-10-08T16:07:21Z</dc:date>
    </item>
    <item>
      <title>Re: Unity Catalog Volume mounting broken by cluster environment variables (http proxy)</title>
      <link>https://community.databricks.com/t5/administration-architecture/unity-catalog-volume-mounting-broken-by-cluster-environment/m-p/136244#M4281</link>
      <description>&lt;P&gt;A solution that worked, while keeping the HTTP_PROXY and HTTPS_PROXY variables set globally, was to add the following definition to the compute policy:&lt;/P&gt;&lt;LI-CODE lang="json"&gt;"spark_env_vars.NO_PROXY": {
  "type": "fixed",
  "value": "localhost,127.0.0.1,169.254.169.254,*.databricks.azure.com,*.azuredatabricks.net,*.databricks.azure.us,*.databricks.azure.cn,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16"
}&lt;/LI-CODE&gt;&lt;P&gt;Alternatively, you could put it straight into the environment variables on the cluster itself.&lt;/P&gt;</description>
      <pubDate>Mon, 27 Oct 2025 18:33:26 GMT</pubDate>
      <guid>https://community.databricks.com/t5/administration-architecture/unity-catalog-volume-mounting-broken-by-cluster-environment/m-p/136244#M4281</guid>
      <dc:creator>amartt</dc:creator>
      <dc:date>2025-10-27T18:33:26Z</dc:date>
    </item>
  </channel>
</rss>

