<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Connection reset error from Databricks notebook but works via curl (GCP) in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/connection-reset-error-from-databricks-notebook-but-works-via/m-p/153119#M53943</link>
    <description>&lt;P class=""&gt;&lt;SPAN&gt;Hi everyone,&lt;/SPAN&gt;&lt;/P&gt;&lt;P class=""&gt;&lt;SPAN&gt;I’m facing a connectivity issue in my Databricks workspace on GCP and would appreciate any guidance.&lt;/SPAN&gt;&lt;/P&gt;&lt;H3&gt;&lt;SPAN&gt;Problem&lt;/SPAN&gt;&lt;/H3&gt;&lt;P class=""&gt;&lt;SPAN&gt;When I run commands from a Databricks notebook, I see intermittent errors like:&lt;/SPAN&gt;&lt;/P&gt;&lt;PRE&gt;&lt;SPAN&gt;Connection reset
Retrying request to https://us-east4.gcp.databricks.com:443&lt;/SPAN&gt;&lt;/PRE&gt;&lt;P class=""&gt;&lt;SPAN&gt;However, when I test connectivity manually from the cluster node using curl, it works fine.&lt;/SPAN&gt;&lt;/P&gt;&lt;DIV&gt;&lt;HR /&gt;&lt;/DIV&gt;&lt;H3&gt;&lt;SPAN&gt;What I verified&lt;/SPAN&gt;&lt;/H3&gt;&lt;OL&gt;&lt;LI&gt;&lt;STRONG&gt;&lt;SPAN&gt;Direct connectivity works&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;PRE&gt;&lt;SPAN&gt;curl -v https://us-east4.gcp.databricks.com&lt;/SPAN&gt;&lt;/PRE&gt;&lt;UL&gt;&lt;LI&gt;&lt;SPAN&gt;Resolves to public IP (34.x.x.x)&lt;/SPAN&gt;&lt;/LI&gt;&lt;LI&gt;&lt;SPAN&gt;TLS handshake successful&lt;/SPAN&gt;&lt;/LI&gt;&lt;LI&gt;&lt;SPAN&gt;Returns HTTP 303 → /login.html&lt;/SPAN&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;&lt;SPAN&gt;DNS resolution is correct&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;PRE&gt;&lt;SPAN&gt;getent hosts us-east4.gcp.databricks.com
→ 34.128.x.x&lt;/SPAN&gt;&lt;/PRE&gt;&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;&lt;SPAN&gt;Proxy removed&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;UL&gt;&lt;LI&gt;&lt;SPAN&gt;Removed HTTP_PROXY / HTTPS_PROXY environment variables&lt;/SPAN&gt;&lt;/LI&gt;&lt;LI&gt;&lt;SPAN&gt;Verified no proxy is being used&lt;/SPAN&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;/OL&gt;&lt;DIV&gt;&lt;HR /&gt;&lt;/DIV&gt;&lt;H3&gt;&lt;SPAN&gt;Issue inside Databricks runtime&lt;/SPAN&gt;&lt;/H3&gt;&lt;UL&gt;&lt;LI&gt;&lt;SPAN&gt;Notebook / Spark jobs still show:&lt;/SPAN&gt;&lt;UL&gt;&lt;LI&gt;&lt;SPAN&gt;Connection reset&lt;/SPAN&gt;&lt;/LI&gt;&lt;LI&gt;&lt;SPAN&gt;Retry attempts in logs&lt;/SPAN&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;DIV&gt;&lt;HR /&gt;&lt;/DIV&gt;&lt;H3&gt;&lt;SPAN&gt;Questions&lt;/SPAN&gt;&lt;/H3&gt;&lt;OL&gt;&lt;LI&gt;&lt;SPAN&gt;Is this expected behavior due to connection reuse / keep-alive in the Databricks runtime?&lt;/SPAN&gt;&lt;/LI&gt;&lt;LI&gt;&lt;SPAN&gt;Could this be related to JVM/Spark HTTP client behavior?&lt;/SPAN&gt;&lt;/LI&gt;&lt;LI&gt;&lt;SPAN&gt;Are there recommended configurations to avoid these connection reset logs?&lt;/SPAN&gt;&lt;/LI&gt;&lt;LI&gt;&lt;SPAN&gt;When should this be considered a real failure vs. a harmless retry?&lt;/SPAN&gt;&lt;/LI&gt;&lt;/OL&gt;&lt;DIV&gt;&lt;HR /&gt;&lt;/DIV&gt;</description>
    <pubDate>Thu, 02 Apr 2026 23:37:49 GMT</pubDate>
    <dc:creator>abhishek13</dc:creator>
    <dc:date>2026-04-02T23:37:49Z</dc:date>
    <item>
      <title>Connection reset error from Databricks notebook but works via curl (GCP)</title>
      <link>https://community.databricks.com/t5/data-engineering/connection-reset-error-from-databricks-notebook-but-works-via/m-p/153119#M53943</link>
      <description>&lt;P class=""&gt;&lt;SPAN&gt;Hi everyone,&lt;/SPAN&gt;&lt;/P&gt;&lt;P class=""&gt;&lt;SPAN&gt;I’m facing a connectivity issue in my Databricks workspace on GCP and would appreciate any guidance.&lt;/SPAN&gt;&lt;/P&gt;&lt;H3&gt;&lt;SPAN&gt;Problem&lt;/SPAN&gt;&lt;/H3&gt;&lt;P class=""&gt;&lt;SPAN&gt;When I run commands from a Databricks notebook, I see intermittent errors like:&lt;/SPAN&gt;&lt;/P&gt;&lt;PRE&gt;&lt;SPAN&gt;Connection reset
Retrying request to https://us-east4.gcp.databricks.com:443&lt;/SPAN&gt;&lt;/PRE&gt;&lt;P class=""&gt;&lt;SPAN&gt;However, when I test connectivity manually from the cluster node using curl, it works fine.&lt;/SPAN&gt;&lt;/P&gt;&lt;DIV&gt;&lt;HR /&gt;&lt;/DIV&gt;&lt;H3&gt;&lt;SPAN&gt;What I verified&lt;/SPAN&gt;&lt;/H3&gt;&lt;OL&gt;&lt;LI&gt;&lt;STRONG&gt;&lt;SPAN&gt;Direct connectivity works&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;PRE&gt;&lt;SPAN&gt;curl -v https://us-east4.gcp.databricks.com&lt;/SPAN&gt;&lt;/PRE&gt;&lt;UL&gt;&lt;LI&gt;&lt;SPAN&gt;Resolves to public IP (34.x.x.x)&lt;/SPAN&gt;&lt;/LI&gt;&lt;LI&gt;&lt;SPAN&gt;TLS handshake successful&lt;/SPAN&gt;&lt;/LI&gt;&lt;LI&gt;&lt;SPAN&gt;Returns HTTP 303 → /login.html&lt;/SPAN&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;&lt;SPAN&gt;DNS resolution is correct&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;PRE&gt;&lt;SPAN&gt;getent hosts us-east4.gcp.databricks.com
→ 34.128.x.x&lt;/SPAN&gt;&lt;/PRE&gt;&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;&lt;SPAN&gt;Proxy removed&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;UL&gt;&lt;LI&gt;&lt;SPAN&gt;Removed HTTP_PROXY / HTTPS_PROXY environment variables&lt;/SPAN&gt;&lt;/LI&gt;&lt;LI&gt;&lt;SPAN&gt;Verified no proxy is being used&lt;/SPAN&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;/OL&gt;&lt;DIV&gt;&lt;HR /&gt;&lt;/DIV&gt;&lt;H3&gt;&lt;SPAN&gt;Issue inside Databricks runtime&lt;/SPAN&gt;&lt;/H3&gt;&lt;UL&gt;&lt;LI&gt;&lt;SPAN&gt;Notebook / Spark jobs still show:&lt;/SPAN&gt;&lt;UL&gt;&lt;LI&gt;&lt;SPAN&gt;Connection reset&lt;/SPAN&gt;&lt;/LI&gt;&lt;LI&gt;&lt;SPAN&gt;Retry attempts in logs&lt;/SPAN&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;DIV&gt;&lt;HR /&gt;&lt;/DIV&gt;&lt;H3&gt;&lt;SPAN&gt;Questions&lt;/SPAN&gt;&lt;/H3&gt;&lt;OL&gt;&lt;LI&gt;&lt;SPAN&gt;Is this expected behavior due to connection reuse / keep-alive in the Databricks runtime?&lt;/SPAN&gt;&lt;/LI&gt;&lt;LI&gt;&lt;SPAN&gt;Could this be related to JVM/Spark HTTP client behavior?&lt;/SPAN&gt;&lt;/LI&gt;&lt;LI&gt;&lt;SPAN&gt;Are there recommended configurations to avoid these connection reset logs?&lt;/SPAN&gt;&lt;/LI&gt;&lt;LI&gt;&lt;SPAN&gt;When should this be considered a real failure vs. a harmless retry?&lt;/SPAN&gt;&lt;/LI&gt;&lt;/OL&gt;&lt;DIV&gt;&lt;HR /&gt;&lt;/DIV&gt;</description>
      <pubDate>Thu, 02 Apr 2026 23:37:49 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/connection-reset-error-from-databricks-notebook-but-works-via/m-p/153119#M53943</guid>
      <dc:creator>abhishek13</dc:creator>
      <dc:date>2026-04-02T23:37:49Z</dc:date>
    </item>
    <item>
      <title>Re: Connection reset error from Databricks notebook but works via curl (GCP)</title>
      <link>https://community.databricks.com/t5/data-engineering/connection-reset-error-from-databricks-notebook-but-works-via/m-p/153204#M53945</link>
      <description>&lt;P&gt;Can someone help with this?&lt;/P&gt;</description>
      <pubDate>Fri, 03 Apr 2026 15:56:57 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/connection-reset-error-from-databricks-notebook-but-works-via/m-p/153204#M53945</guid>
      <dc:creator>abhishek13</dc:creator>
      <dc:date>2026-04-03T15:56:57Z</dc:date>
    </item>
    <item>
      <title>Re: Connection reset error from Databricks notebook but works via curl (GCP)</title>
      <link>https://community.databricks.com/t5/data-engineering/connection-reset-error-from-databricks-notebook-but-works-via/m-p/153216#M53946</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/225575"&gt;@abhishek13&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;This is a classic discrepancy between a JVM HTTP client and system curl, and it's very common in Databricks on GCP.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Why curl works but the notebook doesn't&lt;/STRONG&gt;&lt;BR /&gt;curl opens a fresh TCP connection each time. The Databricks runtime (and Spark internals) use persistent connection pools, typically Apache HttpClient or OkHttp, which hold connections open across requests. GCP's load balancers and firewalls enforce idle timeouts (often 10 minutes, sometimes as low as 3–4 minutes on internal paths) and silently drop connections that sit idle past that limit. The client's connection pool doesn't know the connection is dead until it tries to reuse it, which produces the Connection reset error on the first attempt. The retry then opens a fresh socket, which succeeds, hence the "intermittent" pattern.&lt;/P&gt;&lt;P&gt;The single strongest heuristic: if the retry log line is immediately followed by a success log and the job completes, it is harmless. If you see the same host repeatedly failing across multiple retries without recovery, that warrants deeper investigation (firewall rule change, DNS flap, or a Databricks control plane issue).&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Recommended action plan&lt;/STRONG&gt;&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;Add a TCP keepalive init script to the cluster. This is the lowest-risk, highest-impact fix because it works at the OS level; note that kernel keepalive settings only take effect on sockets that enable SO_KEEPALIVE.&lt;/LI&gt;&lt;LI&gt;Set connection TTL &amp;lt; 540s in any HttpClient pools you control (GCP's effective idle timeout is ~600s, but be conservative).&lt;/LI&gt;&lt;LI&gt;Treat Connection reset as a warning, not an error, in your monitoring; alert only if the retry count per 5-minute window exceeds a threshold (e.g., &amp;gt; 10).&lt;/LI&gt;&lt;LI&gt;If the errors persist specifically after idle periods, check whether the workload is running on serverless compute. Serverless has different network path characteristics on GCP and may need Databricks support involvement for persistent issues.&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;Hope this helps,&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/225575"&gt;@abhishek13&lt;/a&gt;.&lt;/P&gt;</description>
      <pubDate>Fri, 03 Apr 2026 17:38:10 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/connection-reset-error-from-databricks-notebook-but-works-via/m-p/153216#M53946</guid>
      <dc:creator>lingareddy_Alva</dc:creator>
      <dc:date>2026-04-03T17:38:10Z</dc:date>
    </item>
  </channel>
</rss>

