<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Can not connect to databricks on Azure Machine Learning Compute Cluster. in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/can-not-connect-to-databricks-on-azure-machine-learning-compute/m-p/58680#M31238</link>
    <description>&lt;P&gt;Additional information I forgot to include.&lt;/P&gt;&lt;P&gt;The Compute Instance has a User Managed Identity in Azure, and a Service Principal was created in Databricks with its Application ID. The same goes for the Compute Cluster: it has its own User Managed Identity, which is also a Service Principal in Databricks.&lt;/P&gt;&lt;P&gt;Both of them have the correct roles/rights to access clusters.&lt;/P&gt;&lt;P&gt;Locally, I run "az login" to authenticate as my personal user, which also exists in Databricks, but as a user.&lt;/P&gt;&lt;P&gt;On the Compute Cluster, I printed the .databricks-connect file and copied it to my local computer. I then ran the Python code locally and it worked. I checked the cluster logs, and I was using the Compute Cluster Service Principal, which is the managed identity in Azure. So the Service Principal has the correct rights.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Etyr_0-1706612065861.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/6020iBBA0675A6CF7E3E3/image-size/medium/is-moderation-mode/true?v=v2&amp;amp;px=400" role="button" title="Etyr_0-1706612065861.png" alt="Etyr_0-1706612065861.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
    <pubDate>Tue, 30 Jan 2024 10:54:32 GMT</pubDate>
    <dc:creator>Etyr</dc:creator>
    <dc:date>2024-01-30T10:54:32Z</dc:date>
    <item>
      <title>Can not connect to databricks on Azure Machine Learning Compute Cluster.</title>
      <link>https://community.databricks.com/t5/data-engineering/can-not-connect-to-databricks-on-azure-machine-learning-compute/m-p/58674#M31236</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;I am having an issue with the following setup:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;A local machine running WSL 1&lt;UL&gt;&lt;LI&gt;Python 3.8 and 3.10&lt;/LI&gt;&lt;LI&gt;OpenJDK 19.0.1 (version "build 19.0.1+10-21")&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;LI&gt;A Compute Instance in Azure Machine Learning&lt;UL&gt;&lt;LI&gt;Python 3.8&lt;/LI&gt;&lt;LI&gt;OpenJDK 8 (version "1.8.0_392")&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;LI&gt;A Compute Cluster in Azure Machine Learning with a custom Dockerfile&lt;UL&gt;&lt;LI&gt;Python 3.10&lt;/LI&gt;&lt;LI&gt;OpenJDK 19.0.1 (version "build 19.0.1+10-21")&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;I cannot launch PySpark on the Compute Cluster, although it works in the other two environments. Here is how I install OpenJDK on the Compute Cluster (Dockerfile) and in local WSL:&lt;/P&gt;&lt;LI-CODE lang="ruby"&gt;RUN wget https://download.java.net/java/GA/jdk19.0.1/afdd2e245b014143b62ccb916125e3ce/10/GPL/openjdk-19.0.1_linux-x64_bin.tar.gz \
    &amp;amp;&amp;amp; tar xvf openjdk-19.0.1_linux-x64_bin.tar.gz \
    &amp;amp;&amp;amp; mv jdk-19.0.1 /opt/ \
    &amp;amp;&amp;amp; rm openjdk-19.0.1_linux-x64_bin.tar.gz

ENV JAVA_HOME=/opt/jdk-19.0.1
ENV PATH="${PATH}:$JAVA_HOME/bin"&lt;/LI-CODE&gt;&lt;P&gt;In both of them, `java --version` outputs:&lt;/P&gt;&lt;P class="lia-indent-padding-left-30px"&gt;openjdk 19.0.1 2022-10-18&lt;BR /&gt;OpenJDK Runtime Environment (build 19.0.1+10-21)&lt;BR /&gt;OpenJDK 64-Bit Server VM (build 19.0.1+10-21, mixed mode, sharing)&lt;/P&gt;&lt;P&gt;I did not install OpenJDK 8 on the Compute Instance; it was preinstalled by Azure in the VM.&lt;/P&gt;&lt;P&gt;The Compute Instance and the Compute Cluster are in the same subnet in Azure, so neither has a network issue reaching Databricks (all private endpoints are working).&lt;/P&gt;&lt;P&gt;Here is the error I get when running a simple Spark command on the Compute Cluster:&lt;/P&gt;&lt;LI-CODE lang="ruby"&gt;Exception in thread "main" java.lang.ExceptionInInitializerError
	at org.apache.spark.deploy.SparkSubmitArguments.$anonfun$loadEnvironmentArguments$5(SparkSubmitArguments.scala:163)
	at scala.Option.orElse(Option.scala:447)
	at org.apache.spark.deploy.SparkSubmitArguments.loadEnvironmentArguments(SparkSubmitArguments.scala:163)
	at org.apache.spark.deploy.SparkSubmitArguments.&amp;lt;init&amp;gt;(SparkSubmitArguments.scala:118)
	at org.apache.spark.deploy.SparkSubmit$$anon$2$$anon$3.&amp;lt;init&amp;gt;(SparkSubmit.scala:1046)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.parseArguments(SparkSubmit.scala:1046)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:85)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1063)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1072)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.net.UnknownHostException: 019325cc430b495e91604bf9052029ac000000: 019325cc430b495e91604bf9052029ac000000: Name or service not known
	at java.base/java.net.InetAddress.getLocalHost(InetAddress.java:1776)
	at org.apache.spark.util.Utils$.findLocalInetAddress(Utils.scala:1211)
	at org.apache.spark.util.Utils$.localIpAddress$lzycompute(Utils.scala:1204)
	at org.apache.spark.util.Utils$.localIpAddress(Utils.scala:1204)
	at org.apache.spark.util.Utils$.$anonfun$localCanonicalHostName$1(Utils.scala:1261)
	at scala.Option.getOrElse(Option.scala:189)
	at org.apache.spark.util.Utils$.localCanonicalHostName(Utils.scala:1261)
	at org.apache.spark.internal.config.package$.&amp;lt;init&amp;gt;(package.scala:1080)
	at org.apache.spark.internal.config.package$.&amp;lt;clinit&amp;gt;(package.scala)
	... 10 more
Caused by: java.net.UnknownHostException: 019325cc430b495e91604bf9052029ac000000: Name or service not known
	at java.base/java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
	at java.base/java.net.Inet6AddressImpl.lookupAllHostAddr(Inet6AddressImpl.java:52)
	at java.base/java.net.InetAddress$PlatformResolver.lookupByName(InetAddress.java:1059)
	at java.base/java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1668)
	at java.base/java.net.InetAddress$NameServiceAddresses.get(InetAddress.java:1003)
	at java.base/java.net.InetAddress.getAllByName0(InetAddress.java:1658)
	at java.base/java.net.InetAddress.getLocalHost(InetAddress.java:1771)
	... 18 more&lt;/LI-CODE&gt;&lt;P&gt;From the Compute Cluster, I can curl the Databricks API to generate a Personal Access Token (PAT).&lt;/P&gt;&lt;P&gt;I also wrote a class that automatically generates an OAuth2 token from Azure, uses it to generate a Databricks PAT, and then sets up "databricks-connect":&lt;/P&gt;&lt;LI-CODE lang="python"&gt;import subprocess

# settings and DatabricksTokenManager come from my own project code.
# Answers to the prompts of "databricks-connect configure", in order.
stdin_list = [
    "https://" + settings.databricks_address,
    DatabricksTokenManager(settings.databricks_address).pat,
    settings.databricks_cluster_id,
    settings.databricks_org_id,
    str(settings.databricks_port),
]
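# Hypothetical pre-flight check, an assumption that is not part of the script
# as posted. The stack trace above fails in InetAddress.getLocalHost(), so this
# sketch tests whether the machine's own hostname resolves before Spark starts.
import socket


def local_hostname_resolves():
    """Return True if this machine's own hostname resolves to an IP address."""
    try:
        socket.gethostbyname(socket.gethostname())
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


if not local_hostname_resolves():
    print("Local hostname does not resolve; check the entries in /etc/hosts.")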

# Feed the answers to "databricks-connect configure" on stdin.
stdin_string = "\n".join(stdin_list) + "\n"
subprocess.check_output(
    ("databricks-connect", "configure"), input=stdin_string, text=True
)&lt;/LI-CODE&gt;&lt;P&gt;settings.databricks_address is a string in this format: "adb-xxxxxxxxxxxx.x.azuredatabricks.net/"&lt;/P&gt;&lt;P&gt;settings.databricks_cluster_id is taken from the Databricks URL of a cluster, and the same goes for the organization ID and port.&lt;/P&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;{&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp;&amp;nbsp;"host":&amp;nbsp;"&lt;A href="https://adb-xxxxxxxxxxxxxxx.x.azuredatabricks.net" target="_blank"&gt;https://adb-xxxxxxxxxxxxxxx.x.azuredatabricks.net&lt;/A&gt;",&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp;&amp;nbsp;"token":&amp;nbsp;"dapixxxxxxxxxxxxxxxxxxxxxxx-2",&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp;&amp;nbsp;"cluster_id":&amp;nbsp;"0119-xxxxxx-xxxxxxx",&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp;&amp;nbsp;"org_id":&amp;nbsp;"542xxxxxxxxxxx",&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp;&amp;nbsp;"port":&amp;nbsp;"15001"&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;}&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;P&gt;So I cannot understand why it works everywhere except on the Compute Cluster, given the same Python code and the same OpenJDK/Python versions.&lt;/P&gt;</description>
      <pubDate>Tue, 30 Jan 2024 10:20:37 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/can-not-connect-to-databricks-on-azure-machine-learning-compute/m-p/58674#M31236</guid>
      <dc:creator>Etyr</dc:creator>
      <dc:date>2024-01-30T10:20:37Z</dc:date>
    </item>
    <item>
      <title>Re: Can not connect to databricks on Azure Machine Learning Compute Cluster.</title>
      <link>https://community.databricks.com/t5/data-engineering/can-not-connect-to-databricks-on-azure-machine-learning-compute/m-p/58680#M31238</link>
      <description>&lt;P&gt;Additional information I forgot to include.&lt;/P&gt;&lt;P&gt;The Compute Instance has a User Managed Identity in Azure, and a Service Principal was created in Databricks with its Application ID. The same goes for the Compute Cluster: it has its own User Managed Identity, which is also a Service Principal in Databricks.&lt;/P&gt;&lt;P&gt;Both of them have the correct roles/rights to access clusters.&lt;/P&gt;&lt;P&gt;Locally, I run "az login" to authenticate as my personal user, which also exists in Databricks, but as a user.&lt;/P&gt;&lt;P&gt;On the Compute Cluster, I printed the .databricks-connect file and copied it to my local computer. I then ran the Python code locally and it worked. I checked the cluster logs, and I was using the Compute Cluster Service Principal, which is the managed identity in Azure. So the Service Principal has the correct rights.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Etyr_0-1706612065861.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/6020iBBA0675A6CF7E3E3/image-size/medium/is-moderation-mode/true?v=v2&amp;amp;px=400" role="button" title="Etyr_0-1706612065861.png" alt="Etyr_0-1706612065861.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 30 Jan 2024 10:54:32 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/can-not-connect-to-databricks-on-azure-machine-learning-compute/m-p/58680#M31238</guid>
      <dc:creator>Etyr</dc:creator>
      <dc:date>2024-01-30T10:54:32Z</dc:date>
    </item>
    <item>
      <title>Re: Can not connect to databricks on Azure Machine Learning Compute Cluster.</title>
      <link>https://community.databricks.com/t5/data-engineering/can-not-connect-to-databricks-on-azure-machine-learning-compute/m-p/58710#M31244</link>
      <description>&lt;P&gt;I managed to reproduce the error on the Compute Instance by deleting the /etc/hosts file.&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;Exception in thread "main" java.lang.ExceptionInInitializerError
        at org.apache.spark.deploy.SparkSubmitArguments.$anonfun$loadEnvironmentArguments$5(SparkSubmitArguments.scala:163)
        at scala.Option.orElse(Option.scala:447)
        at org.apache.spark.deploy.SparkSubmitArguments.loadEnvironmentArguments(SparkSubmitArguments.scala:163)
        at org.apache.spark.deploy.SparkSubmitArguments.&amp;lt;init&amp;gt;(SparkSubmitArguments.scala:118)
        at org.apache.spark.deploy.SparkSubmit$$anon$2$$anon$3.&amp;lt;init&amp;gt;(SparkSubmit.scala:1046)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.parseArguments(SparkSubmit.scala:1046)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:85)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1063)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1072)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.net.UnknownHostException: my-compute-instance: my-compute-instance: Name or service not known
        at java.net.InetAddress.getLocalHost(InetAddress.java:1432)
        at org.apache.spark.util.Utils$.findLocalInetAddress(Utils.scala:1211)
        at org.apache.spark.util.Utils$.localIpAddress$lzycompute(Utils.scala:1204)
        at org.apache.spark.util.Utils$.localIpAddress(Utils.scala:1204)
        at org.apache.spark.util.Utils$.$anonfun$localCanonicalHostName$1(Utils.scala:1261)
        at scala.Option.getOrElse(Option.scala:189)
        at org.apache.spark.util.Utils$.localCanonicalHostName(Utils.scala:1261)
        at org.apache.spark.internal.config.package$.&amp;lt;init&amp;gt;(package.scala:1080)
        at org.apache.spark.internal.config.package$.&amp;lt;clinit&amp;gt;(package.scala)
        ... 10 more
Caused by: java.net.UnknownHostException: my-compute-instance: Name or service not known
        at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
        at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:867)
        at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1302)
        at java.net.InetAddress$NameServiceAddresses.get(InetAddress.java:815)
        at java.net.InetAddress.getAllByName0(InetAddress.java:1291)
        at java.net.InetAddress.getLocalHost(InetAddress.java:1427)
        ... 18 more&lt;/LI-CODE&gt;&lt;P&gt;Here is the /etc/hosts file on the Compute Instance:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;127.0.0.1 localhost my-compute-instance my-compute-instance my-compute-instance my-compute-instance my-compute-instance my-compute-instance my-compute-instance my-compute-instance my-compute-instance my-compute-instance my-compute-instance my-compute-instance my-compute-instance my-compute-instance

# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts&lt;/LI-CODE&gt;&lt;P&gt;Here is the /etc/hosts file on the Compute Cluster:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;127.0.0.1 localhost

# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
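ff02::3 ip6-allhosts&lt;/LI-CODE&gt;&lt;P&gt;Unlike the Compute Instance, the Compute Cluster's hosts file has no entry for the machine's own hostname.&lt;/P&gt;</description>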
      <pubDate>Tue, 30 Jan 2024 14:25:45 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/can-not-connect-to-databricks-on-azure-machine-learning-compute/m-p/58710#M31244</guid>
      <dc:creator>Etyr</dc:creator>
      <dc:date>2024-01-30T14:25:45Z</dc:date>
    </item>
  </channel>
</rss>

