<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Serverless Compute connectivity issues with .com.br domains vs. Classic Clusters Spark hangs in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/serverless-compute-connectivity-issues-with-com-br-domains-vs/m-p/157005#M54497</link>
    <pubDate>Fri, 15 May 2026 15:39:46 GMT</pubDate>
    <dc:creator>GaneshI</dc:creator>
    <dc:date>2026-05-15T15:39:46Z</dc:date>
    <item>
      <title>Serverless Compute connectivity issues with .com.br domains vs. Classic Clusters Spark hangs</title>
      <link>https://community.databricks.com/t5/data-engineering/serverless-compute-connectivity-issues-with-com-br-domains-vs/m-p/156856#M54486</link>
      <description>&lt;P&gt;Hi everyone,&lt;/P&gt;&lt;P&gt;I'm facing two specific issues in my Databricks Premium workspace (AWS - sa-east-1).&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;&lt;P&gt;&lt;STRONG&gt;Serverless Connectivity Issue:&lt;/STRONG&gt; When using Serverless compute, I can successfully call APIs ending in .com, but calls to .com.br domains fail with connection/DNS errors. The exact same code works fine when running on a Classic Cluster.&lt;/P&gt;&lt;/LI&gt;&lt;/OL&gt;&lt;UL&gt;&lt;LI&gt;&lt;P&gt;&lt;STRONG&gt;VPC Setup:&lt;/STRONG&gt; Custom VPC with Unity Catalog enabled.&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;&lt;STRONG&gt;Security Groups:&lt;/STRONG&gt; Outbound rules are open for port 443 (0.0.0.0/0).&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;&lt;STRONG&gt;Symptom:&lt;/STRONG&gt; It feels like a DNS resolution or egress filtering issue specific to Serverless.&lt;/P&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;OL start="2"&gt;&lt;LI&gt;&lt;P&gt;&lt;STRONG&gt;Classic Cluster Spark Hang:&lt;/STRONG&gt; On the other hand, when I switch to a Classic Cluster to bypass the connectivity issue, any Spark command (e.g., spark.read or simple transformations) hangs indefinitely without starting the job.&lt;/P&gt;&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;Has anyone experienced this specific behavior where Serverless ignores certain TLDs or where Spark fails to initialize on Classic Clusters in the same VPC?&lt;/P&gt;&lt;P&gt;Thanks in advance!&lt;BR /&gt;&lt;BR /&gt;(pt-br, translated)&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Hi everyone,&lt;/P&gt;&lt;P&gt;I'm facing two distinct problems in my Premium workspace (AWS, sa-east-1 region):&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;&lt;P&gt;&lt;STRONG&gt;Serverless connectivity:&lt;/STRONG&gt; I can't call APIs ending in .com.br from Serverless compute. If the API is on .com, it works normally.
The same code works on a Classic Cluster, which suggests that Serverless handles DNS or network egress differently.&lt;/P&gt;&lt;/LI&gt;&lt;/OL&gt;&lt;UL&gt;&lt;LI&gt;&lt;P&gt;I have already checked the Security Groups, and port 443 is open to 0.0.0.0/0.&lt;/P&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;OL start="2"&gt;&lt;LI&gt;&lt;P&gt;&lt;STRONG&gt;Spark "loading forever" on the Cluster:&lt;/STRONG&gt; To work around the problem above, I tried a regular Cluster. The API request code works, but any Spark command (such as reading a dataframe or a simple count) keeps processing indefinitely and never starts the job.&lt;/P&gt;&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;Has anyone run into something similar, or does anyone know of a VPC/Unity Catalog setting that could be causing this conflict between compute type and domain resolution?&lt;/P&gt;&lt;P&gt;Thanks!&lt;/P&gt;</description>
      <pubDate>Wed, 13 May 2026 19:12:59 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/serverless-compute-connectivity-issues-with-com-br-domains-vs/m-p/156856#M54486</guid>
      <dc:creator>ThiagoRosetti</dc:creator>
      <dc:date>2026-05-13T19:12:59Z</dc:date>
    </item>
    <item>
      <title>Re: Serverless Compute connectivity issues with .com.br domains vs. Classic Clusters Spark hangs</title>
      <link>https://community.databricks.com/t5/data-engineering/serverless-compute-connectivity-issues-with-com-br-domains-vs/m-p/157005#M54497</link>
      <description>&lt;P class=""&gt;&lt;STRONG&gt;Hi there,&lt;/STRONG&gt;&lt;/P&gt;&lt;P class=""&gt;Great breakdown of the symptoms — these are actually two distinct issues likely sharing a common root cause in your VPC/network configuration. Let me address both:&lt;/P&gt;&lt;HR /&gt;&lt;H3&gt;Issue 1: Serverless Compute — .com.br DNS Resolution Failure&lt;/H3&gt;&lt;H4&gt;Root Cause&lt;/H4&gt;&lt;P class=""&gt;Serverless compute in Databricks &lt;STRONG&gt;does NOT run inside your custom VPC&lt;/STRONG&gt;. It runs in a &lt;STRONG&gt;Databricks-managed network&lt;/STRONG&gt; and egresses through Databricks' own infrastructure. This means:&lt;/P&gt;&lt;UL class=""&gt;&lt;LI&gt;Your VPC's outbound Security Group rules (0.0.0.0/0 on port 443) &lt;STRONG&gt;do not apply&lt;/STRONG&gt; to Serverless&lt;/LI&gt;&lt;LI&gt;Serverless traffic goes through &lt;STRONG&gt;Databricks-controlled egress&lt;/STRONG&gt;, which may have its own DNS resolvers and egress filtering&lt;/LI&gt;&lt;LI&gt;.com.br TLD resolution can fail if the managed DNS used by Serverless doesn't properly resolve &lt;STRONG&gt;country-code TLDs (ccTLDs)&lt;/STRONG&gt; or if those domains are not on Databricks' egress allowlist&lt;/LI&gt;&lt;/UL&gt;&lt;H4&gt;Fix for Serverless Connectivity&lt;/H4&gt;&lt;P class=""&gt;&lt;STRONG&gt;Option 1 — Use Serverless Network Policies (Recommended)&lt;/STRONG&gt; Databricks introduced &lt;STRONG&gt;Serverless Network Policies&lt;/STRONG&gt; to control egress from Serverless compute. 
You need to explicitly allow the .com.br destinations:&lt;/P&gt;&lt;UL class=""&gt;&lt;LI&gt;Go to &lt;STRONG&gt;Account Console → Network → Serverless Network Policies&lt;/STRONG&gt;&lt;/LI&gt;&lt;LI&gt;Add an egress policy that explicitly allows the target .com.br domains/IPs&lt;/LI&gt;&lt;LI&gt;This is the &lt;STRONG&gt;correct and supported&lt;/STRONG&gt; way to control Serverless egress; your VPC Security Groups have no effect on it&lt;/LI&gt;&lt;/UL&gt;&lt;P class=""&gt;&lt;STRONG&gt;Option 2 — Contact Databricks Support&lt;/STRONG&gt; If the .com.br domains are being blocked at the Databricks-managed egress layer (not your VPC), you'll need Support to confirm whether those ccTLDs are filtered and, if so, to allowlist them at the platform level for your workspace in sa-east-1.&lt;/P&gt;&lt;P class=""&gt;&lt;STRONG&gt;Option 3 — Verify DNS explicitly&lt;/STRONG&gt; In a Serverless notebook, run the following (yourtarget.com.br is a placeholder for the failing domain):&lt;/P&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;python&lt;/DIV&gt;&lt;DIV class=""&gt;&lt;PRE&gt;import socket
host = "yourtarget.com.br"  # placeholder: the domain that fails
try:
    print(socket.getaddrinfo(host, 443))  # DNS lookup only
    socket.create_connection((host, 443), timeout=10).close()  # TCP connect
    print("TCP connect OK")
except socket.gaierror as e:
    print(f"DNS failed: {e}")
except OSError as e:
    print(f"TCP connect failed: {e}")&lt;/PRE&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;P class=""&gt;This shows whether you are hitting a &lt;STRONG&gt;DNS resolution failure&lt;/STRONG&gt; or a &lt;STRONG&gt;TCP connection block&lt;/STRONG&gt;, an important distinction for Support.&lt;/P&gt;&lt;HR /&gt;&lt;H3&gt;Issue 2: Classic Cluster — Spark Hanging Indefinitely&lt;/H3&gt;&lt;H4&gt;Root Cause&lt;/H4&gt;&lt;P class=""&gt;Classic clusters &lt;STRONG&gt;do run inside your VPC&lt;/STRONG&gt;, so this is most likely a &lt;STRONG&gt;VPC networking/configuration problem&lt;/STRONG&gt;. A Spark command that hangs without ever starting a job (rather than failing) typically points to one of the following:&lt;/P&gt;&lt;DIV class=""&gt;&lt;TABLE&gt;&lt;THEAD&gt;&lt;TR&gt;&lt;TH&gt;Cause&lt;/TH&gt;&lt;TH&gt;Explanation&lt;/TH&gt;&lt;/TR&gt;&lt;/THEAD&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;&lt;STRONG&gt;Driver ↔ Executor communication blocked&lt;/STRONG&gt;&lt;/TD&gt;&lt;TD&gt;Security Groups may block internal cluster traffic on required ports&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;STRONG&gt;S3 / Metastore connectivity issue&lt;/STRONG&gt;&lt;/TD&gt;&lt;TD&gt;Unity Catalog metastore or S3 access is blocked, causing Spark context init to stall&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;STRONG&gt;Missing VPC Endpoints&lt;/STRONG&gt;&lt;/TD&gt;&lt;TD&gt;AWS service endpoints (S3, STS, Kinesis) may be missing, causing timeouts&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;STRONG&gt;DNS resolution failure inside VPC&lt;/STRONG&gt;&lt;/TD&gt;&lt;TD&gt;Custom VPC may have DNS hostnames/resolution not enabled&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/DIV&gt;&lt;H4&gt;Fix for Classic Cluster Spark Hang&lt;/H4&gt;&lt;P class=""&gt;&lt;STRONG&gt;Step 1 — Check VPC DNS Settings (Most Common 
Fix)&lt;/STRONG&gt;&lt;/P&gt;&lt;P class=""&gt;In AWS Console → Your VPC → Actions:&lt;/P&gt;&lt;UL class=""&gt;&lt;LI&gt;&lt;STRONG&gt;Enable DNS hostnames&lt;/STRONG&gt; → must be Yes&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Enable DNS resolution&lt;/STRONG&gt; → must be Yes&lt;/LI&gt;&lt;/UL&gt;&lt;P class=""&gt;If either is disabled, cluster nodes may be unable to resolve each other or AWS service endpoints, which can cause silent hangs.&lt;/P&gt;&lt;P class=""&gt;&lt;STRONG&gt;Step 2 — Verify Security Group Inbound Rules for Internal Traffic&lt;/STRONG&gt;&lt;/P&gt;&lt;P class=""&gt;Databricks Classic Clusters require &lt;STRONG&gt;self-referencing inbound rules&lt;/STRONG&gt; in the Security Group:&lt;/P&gt;&lt;DIV class=""&gt;&lt;TABLE&gt;&lt;THEAD&gt;&lt;TR&gt;&lt;TH&gt;Type&lt;/TH&gt;&lt;TH&gt;Protocol&lt;/TH&gt;&lt;TH&gt;Port Range&lt;/TH&gt;&lt;TH&gt;Source&lt;/TH&gt;&lt;/TR&gt;&lt;/THEAD&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;All TCP&lt;/TD&gt;&lt;TD&gt;TCP&lt;/TD&gt;&lt;TD&gt;0–65535&lt;/TD&gt;&lt;TD&gt;Same Security Group ID&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;All UDP&lt;/TD&gt;&lt;TD&gt;UDP&lt;/TD&gt;&lt;TD&gt;0–65535&lt;/TD&gt;&lt;TD&gt;Same Security Group ID&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/DIV&gt;&lt;P class=""&gt;Without these rules, Driver and Executor nodes can't communicate and Spark will hang silently. Databricks' requirements for a customer-managed VPC also include matching self-referencing outbound rules (all TCP and UDP to the same Security Group), so an outbound list that only opens port 443 to 0.0.0.0/0 does not cover this internal traffic.&lt;/P&gt;&lt;P class=""&gt;&lt;STRONG&gt;Step 3 — Verify VPC Endpoints Exist&lt;/STRONG&gt;&lt;/P&gt;&lt;P class=""&gt;For Unity Catalog + AWS in a custom VPC, these endpoints are strongly recommended:&lt;/P&gt;&lt;UL class=""&gt;&lt;LI&gt;com.amazonaws.sa-east-1.s3 (Gateway type)&lt;/LI&gt;&lt;LI&gt;com.amazonaws.sa-east-1.sts&lt;/LI&gt;&lt;LI&gt;com.amazonaws.sa-east-1.kinesis-streams (if using streaming)&lt;/LI&gt;&lt;/UL&gt;&lt;P class=""&gt;In a private subnet with no other route to S3 or STS (for example, no NAT gateway), missing endpoints can cause Spark to stall during initialization.&lt;/P&gt;&lt;P class=""&gt;&lt;STRONG&gt;Step 4 — Check Cluster Event Logs&lt;/STRONG&gt; In the Databricks UI → Cluster → &lt;STRONG&gt;Event Log&lt;/STRONG&gt; tab, look for timeout or unreachable host errors that may not surface in 
the notebook itself.&lt;/P&gt;&lt;HR /&gt;&lt;H3&gt;Likely Common Root Cause&lt;/H3&gt;&lt;P class=""&gt;Both issues point to an &lt;STRONG&gt;incomplete network configuration&lt;/STRONG&gt; for a workspace with a &lt;STRONG&gt;custom VPC in sa-east-1&lt;/STRONG&gt;, though at different layers:&lt;/P&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;PRE&gt;&lt;SPAN&gt;Serverless Issue  → VPC rules don't apply; need Serverless Network Policy for .com.br&lt;/SPAN&gt;
&lt;SPAN&gt;Classic Hang      → Missing self-referencing SG rules OR DNS not enabled in VPC&lt;/SPAN&gt;&lt;/PRE&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;P class=""&gt;&lt;STRONG&gt;Recommended action order:&lt;/STRONG&gt;&lt;/P&gt;&lt;OL class=""&gt;&lt;LI&gt;Fix VPC DNS settings first&lt;/LI&gt;&lt;LI&gt;Add self-referencing Security Group rules&lt;/LI&gt;&lt;LI&gt;Add missing VPC Endpoints&lt;/LI&gt;&lt;LI&gt;Add Serverless Network Policy for .com.br egress&lt;/LI&gt;&lt;LI&gt;If Serverless still fails, open a Support ticket with the DNS test result above&lt;/LI&gt;&lt;/OL&gt;&lt;HR /&gt;&lt;P class=""&gt;Hope this helps unblock both issues! What does the cluster Event Log show? That will help narrow down the Classic Cluster hang further.&lt;/P&gt;</description>
      <pubDate>Fri, 15 May 2026 15:39:46 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/serverless-compute-connectivity-issues-with-com-br-domains-vs/m-p/157005#M54497</guid>
      <dc:creator>GaneshI</dc:creator>
      <dc:date>2026-05-15T15:39:46Z</dc:date>
    </item>
  </channel>
</rss>

