<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Notebook is stuck and cluster goes into waiting state while using spark libraries in Administration &amp; Architecture</title>
    <link>https://community.databricks.com/t5/administration-architecture/notebook-is-stuck-and-cluster-goes-into-waiting-state-while/m-p/89608#M1767</link>
    <description>&lt;P&gt;I don't think it is included in runtime 14.3. I tried running my notebook without installing the library, and it fails straight away because the library has not been installed.&lt;/P&gt;&lt;P&gt;I have opened the firewall rules using service tags on port 443, but it still doesn't help.&lt;/P&gt;</description>
    <pubDate>Thu, 12 Sep 2024 11:36:11 GMT</pubDate>
    <dc:creator>BhawaniD</dc:creator>
    <dc:date>2024-09-12T11:36:11Z</dc:date>
    <item>
      <title>Notebook is stuck and cluster goes into waiting state while using spark libraries</title>
      <link>https://community.databricks.com/t5/administration-architecture/notebook-is-stuck-and-cluster-goes-into-waiting-state-while/m-p/89595#M1764</link>
      <description>&lt;P&gt;Hey,&lt;/P&gt;&lt;P&gt;We have installed the com.databricks:spark-xml_2.12:0.18.0 library in our VNET-injected Databricks workspace to read XML files from a storage account. The notebook runs successfully for text files when the cluster is started without the library installed. However, when running the notebook with XML files, the cluster enters a waiting state.&lt;/P&gt;&lt;P&gt;Our Databricks subnet has a route table attached, and all traffic is routed through our firewall. When we disassociate the route table from the Databricks public subnet, the notebook runs without any issues, indicating that the firewall is blocking the required connectivity. However, I am unable to determine which ports or FQDNs need to be opened to resolve this issue.&amp;nbsp;&lt;/P&gt;&lt;P&gt;I would greatly appreciate any thoughts on this!&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 12 Sep 2024 10:21:28 GMT</pubDate>
      <guid>https://community.databricks.com/t5/administration-architecture/notebook-is-stuck-and-cluster-goes-into-waiting-state-while/m-p/89595#M1764</guid>
      <dc:creator>BhawaniD</dc:creator>
      <dc:date>2024-09-12T10:21:28Z</dc:date>
    </item>
    <item>
      <title>Re: Notebook is stuck and cluster goes into waiting state while using spark libraries</title>
      <link>https://community.databricks.com/t5/administration-architecture/notebook-is-stuck-and-cluster-goes-into-waiting-state-while/m-p/89604#M1766</link>
      <description>&lt;P&gt;Since it's a Maven dependency, it should simply need HTTP/HTTPS on ports 80/443.&lt;/P&gt;&lt;P&gt;Besides, are you aware that native XML support has been included since &lt;A href="https://docs.databricks.com/en/release-notes/runtime/14.3lts.html#native-xml-file-format-support-public-preview" target="_self"&gt;runtime 14.3&lt;/A&gt;? It replaces the spark-xml library.&lt;/P&gt;</description>
      <pubDate>Thu, 12 Sep 2024 11:13:31 GMT</pubDate>
      <guid>https://community.databricks.com/t5/administration-architecture/notebook-is-stuck-and-cluster-goes-into-waiting-state-while/m-p/89604#M1766</guid>
      <dc:creator>Witold</dc:creator>
      <dc:date>2024-09-12T11:13:31Z</dc:date>
    </item>
    <item>
      <title>Re: Notebook is stuck and cluster goes into waiting state while using spark libraries</title>
      <link>https://community.databricks.com/t5/administration-architecture/notebook-is-stuck-and-cluster-goes-into-waiting-state-while/m-p/89608#M1767</link>
      <description>&lt;P&gt;I don't think it is included in runtime 14.3. I tried running my notebook without installing the library, and it fails straight away because the library has not been installed.&lt;/P&gt;&lt;P&gt;I have opened the firewall rules using service tags on port 443, but it still doesn't help.&lt;/P&gt;</description>
      <pubDate>Thu, 12 Sep 2024 11:36:11 GMT</pubDate>
      <guid>https://community.databricks.com/t5/administration-architecture/notebook-is-stuck-and-cluster-goes-into-waiting-state-while/m-p/89608#M1767</guid>
      <dc:creator>BhawaniD</dc:creator>
      <dc:date>2024-09-12T11:36:11Z</dc:date>
    </item>
    <item>
      <title>Re: Notebook is stuck and cluster goes into waiting state while using spark libraries</title>
      <link>https://community.databricks.com/t5/administration-architecture/notebook-is-stuck-and-cluster-goes-into-waiting-state-while/m-p/89616#M1768</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;I don't think it is included in the runtime 14.3.&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;That's not correct: native XML support is indeed included. Please check the &lt;A href="https://docs.databricks.com/en/query/formats/xml.html" target="_self"&gt;documentation&lt;/A&gt; for how to use it properly, as there may be slight differences from spark-xml. It's included because spark-xml is becoming obsolete: native XML support will be part of Spark 4. In Databricks, we can already use it today, starting from runtime 14.3.&lt;/P&gt;&lt;BLOCKQUOTE&gt;&lt;HR /&gt;I have opened the firewall rules using service tags on port 443 but still it doesn't help.&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;You might want to consult your network colleagues to configure it properly. Maven usually downloads libraries from, for example, &lt;A href="https://repo1.maven.org/maven2/com/databricks/spark-xml_2.12/0.18.0/" target="_self"&gt;here&lt;/A&gt; (or one of its mirrors).&lt;/P&gt;</description>
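      <content:encoded><![CDATA[
To make the Maven download location mentioned above concrete, here is a minimal sketch of how a Maven coordinate maps to a repository URL, and which host and port a firewall rule would then have to allow. It assumes Maven Central (repo1.maven.org) is the repository actually being used; a workspace configured with a mirror or a private repository would resolve elsewhere. The helper names maven_artifact_url and firewall_target are hypothetical and only serve this illustration.

```python
from urllib.parse import urlsplit

# Assumption: Maven Central is the resolver in use; a workspace may be
# configured with a mirror or a private repository instead.
MAVEN_CENTRAL = "https://repo1.maven.org/maven2"


def maven_artifact_url(coordinate, base=MAVEN_CENTRAL):
    """Map 'group:artifact:version' to its directory URL in a Maven repository."""
    group, artifact, version = coordinate.split(":")
    # Standard Maven repository layout: dots in the group ID become path separators.
    return f"{base}/{group.replace('.', '/')}/{artifact}/{version}/"


def firewall_target(url):
    """Return the (host, port) pair that an egress rule would need to allow."""
    parts = urlsplit(url)
    # Fall back to the scheme's default port when the URL does not name one.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    return parts.hostname, port


url = maven_artifact_url("com.databricks:spark-xml_2.12:0.18.0")
host, port = firewall_target(url)
print(url)
print(host, port)
```

The resulting URL is the same directory linked in the reply above. Because Maven Central is not an Azure service, a rule based on service tags may not cover it, so an FQDN-based rule for the printed host is probably what needs checking with the network team.
]]></content:encoded>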
      <pubDate>Thu, 12 Sep 2024 12:18:53 GMT</pubDate>
      <guid>https://community.databricks.com/t5/administration-architecture/notebook-is-stuck-and-cluster-goes-into-waiting-state-while/m-p/89616#M1768</guid>
      <dc:creator>Witold</dc:creator>
      <dc:date>2024-09-12T12:18:53Z</dc:date>
    </item>
  </channel>
</rss>

