<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Long-running Python http POST hangs in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/long-running-python-http-post-hangs/m-p/140671#M51504</link>
    <description>&lt;P&gt;Hello,&lt;BR /&gt;IMHO, running an HTTP-related task on a Spark cluster is an anti-pattern. This kind of code executes on the driver, runs synchronously, and adds overhead. This is one of the reasons DLT (or SDP - Spark Declarative Pipeline) does not have REST-based tasks.&lt;BR /&gt;&lt;BR /&gt;Please check whether this task can be done outside Databricks, for example:&lt;BR /&gt;1) Event-based trigger:&amp;nbsp;push the result from Databricks to cloud storage; this raises an event (Event Grid) for a listener such as a Function App or Logic App, which performs the HTTP task.&lt;BR /&gt;2) Classic poller: an Azure Function App checks for an expected condition every 'n' minutes and, if it is met, executes the HTTP task.&lt;/P&gt;</description>
    <pubDate>Sun, 30 Nov 2025 21:18:14 GMT</pubDate>
    <dc:creator>siva-anantha</dc:creator>
    <dc:date>2025-11-30T21:18:14Z</dc:date>
    <item>
      <title>Long-running Python http POST hangs</title>
      <link>https://community.databricks.com/t5/data-engineering/long-running-python-http-post-hangs/m-p/140322#M51383</link>
      <description>&lt;P&gt;As one of the steps in my data engineering pipeline, I need to perform a POST request to an HTTP (not HTTPS) server.&lt;BR /&gt;This all works fine, except in the situation described below, where it hangs indefinitely.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Environment:&lt;/STRONG&gt;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Azure Databricks Runtime 13.3 LTS&lt;/LI&gt;&lt;LI&gt;Python 3.10.12&lt;/LI&gt;&lt;LI&gt;Executing from a notebook&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;STRONG&gt;Scenario:&lt;/STRONG&gt;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Example upload of a (big) file:&lt;/LI&gt;&lt;/UL&gt;&lt;LI-CODE lang="python"&gt;headers = {"Content-Type": f"{mime_type}"}
chunk_size = 1024*1024
response = requests.post(
	destination_repo_url, 
	headers=headers,
	auth=auth,
	timeout=10,
	data=(chunk for chunk in iter(lambda: source_file.read(chunk_size), b"")))
response.raise_for_status()&lt;/LI-CODE&gt;&lt;P&gt;(The timeout value is arbitrary here; the behavior is identical whatever value we choose.)&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Example without file upload complexity:&lt;/LI&gt;&lt;/UL&gt;&lt;LI-CODE lang="python"&gt;response = requests.post(
   destination_REST_URL_triggering_long_operation,
   auth=auth)&lt;/LI-CODE&gt;&lt;P&gt;&lt;STRONG&gt;Expected behavior:&lt;/STRONG&gt;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Operation takes the required time (e.g. 5 minutes), then completes.&lt;BR /&gt;The actual transfer time is short and the server's processing is long, but the request should eventually complete without issues.&lt;/LI&gt;&lt;LI&gt;One would expect no difference in behavior between a local Python and the one running on the Azure Databricks cluster, or else a clear explanation of why it behaves differently and how to avoid it.&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;STRONG&gt;Observed behavior:&lt;/STRONG&gt;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Operation never ends on the client side (the server side completes as usual).&lt;/LI&gt;&lt;LI&gt;The provided timeout doesn't make any difference, although you would expect the read timeout to trigger, since nothing is received from the server during its long postprocessing.&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;STRONG&gt;Alternatives tried:&lt;/STRONG&gt;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;curl as a subprocess works as expected.&lt;UL&gt;&lt;LI&gt;You see the upload taking place (in case of upload).&lt;/LI&gt;&lt;LI&gt;Then curl logs multiple lines with no transfer.&lt;/LI&gt;&lt;LI&gt;Then it completes correctly after the server sends back its 204.&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;LI&gt;Running from a local system (so not Databricks), Wireshark shows:&lt;UL&gt;&lt;LI&gt;POST operation is invoked&lt;/LI&gt;&lt;LI&gt;Data is sent (in case of the file transfer) and completes&lt;/LI&gt;&lt;LI&gt;Server is postprocessing&lt;/LI&gt;&lt;LI&gt;If timeout &amp;lt; processing time&lt;UL&gt;&lt;LI&gt;Client gives up with a ReadTimeout.&lt;BR /&gt;This is expected, normal behavior. 
To avoid this, set the read timeout high enough.&lt;/LI&gt;&lt;LI&gt;Server continues its work and completes it correctly (but can no longer report it because the client closed the connection).&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;LI&gt;Else&lt;UL&gt;&lt;LI&gt;Server replies with the usual 204 once completed&lt;/LI&gt;&lt;LI&gt;Client completes the blocking POST request normally.&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;LI&gt;I don't have the server's implementation under control, so I can't change that.&lt;/LI&gt;&lt;/UL&gt;</description>
      <pubDate>Tue, 25 Nov 2025 16:59:15 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/long-running-python-http-post-hangs/m-p/140322#M51383</guid>
      <dc:creator>Johan_Van_Noten</dc:creator>
      <dc:date>2025-11-25T16:59:15Z</dc:date>
    </item>
    <item>
      <title>Re: Long-running Python http POST hangs</title>
      <link>https://community.databricks.com/t5/data-engineering/long-running-python-http-post-hangs/m-p/140410#M51418</link>
      <description>&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;You are experiencing different behaviors running a long-running&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;requests.post()&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;operation in Azure Databricks (Python) versus running it locally. Locally, the&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;timeout&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;behaves as expected, but in Databricks the client “hangs indefinitely” even after server post-processing has completed and a response (204) is sent. However, alternatives like&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;curl&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;as a subprocess in Databricks work as expected.&lt;/P&gt;
&lt;H2 id="key-observations" class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0 md:text-lg [hr+&amp;amp;]:mt-4"&gt;Key Observations&lt;/H2&gt;
&lt;UL class="marker:text-quiet list-disc"&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;The&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;timeout&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;parameter in&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;requests.post()&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;behaves as a&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;connect and read timeout&lt;/STRONG&gt;. If the server doesn’t send&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;EM&gt;any&lt;/EM&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;bytes for longer than&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;timeout&lt;/CODE&gt;, a&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;ReadTimeout&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;should trigger.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;With Azure Databricks Runtime (ADR), Python’s networking stack might be subtly affected by the cluster’s managed environment (network/NAT-level buffering, virtualized proxies, or custom firewall policies).&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;Your alternative test with&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;curl&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;works, which confirms the network route and server aren’t blocking the traffic.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;You see expected behavior running locally; only Databricks hangs indefinitely, even though the server completes successfully.&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
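&lt;P&gt;The first observation can be checked in isolation from the real server. The following is a minimal sketch, not a diagnosis: it uses only the standard library (&lt;CODE&gt;http.client&lt;/CODE&gt; rather than&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;requests&lt;/CODE&gt;), and the helper names&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;start_silent_server&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;and&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;probe_read_timeout&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;are hypothetical. It starts a local server that accepts a POST and then stays silent, and reports whether the client-side socket timeout fires.&lt;/P&gt;

```python
import http.client
import socket
import threading


def start_silent_server():
    """Listen on a free local port, accept one connection and never answer it."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]
    release = threading.Event()

    def serve():
        connection, _ = listener.accept()
        connection.recv(65536)   # swallow (part of) the request
        release.wait(30)         # stay silent until released or 30 s pass
        connection.close()
        listener.close()

    threading.Thread(target=serve, daemon=True).start()
    return port, release


def probe_read_timeout(port, timeout_seconds):
    """POST to the silent server and report whether the socket timeout fired."""
    client = http.client.HTTPConnection("127.0.0.1", port, timeout=timeout_seconds)
    try:
        client.request("POST", "/", body=b"ping")
        client.getresponse()     # blocks until data arrives or the timeout expires
        return "got response"
    except socket.timeout:
        return "timed out"
    finally:
        client.close()
```

&lt;P&gt;Because the probe talks to the loopback interface, it only exercises the Python socket layer on the machine where it runs and says nothing about the network path to the real server. If it returns "timed out" on the cluster, the local timeout mechanism itself is working there.&lt;/P&gt;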
&lt;H2 id="potential-causes" class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0 md:text-lg [hr+&amp;amp;]:mt-4"&gt;Potential Causes&lt;/H2&gt;
&lt;UL class="marker:text-quiet list-disc"&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;&lt;STRONG&gt;Databricks network virtualization:&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;Databricks clusters often run in containers or on VMs with network proxies, which can interfere with low-level socket timeout detection by Python&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;requests&lt;/CODE&gt;.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;&lt;STRONG&gt;Requests library limitations:&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;In some environments (especially with HTTP/1.1 keepalives), the Python socket layer’s timeout detection can be bypassed if the underlying TCP connection is managed by an intermediary.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;&lt;STRONG&gt;No data transfer during server post-processing:&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;If the server sends no traffic (not even keepalive headers or HTTP chunked responses) during its post-processing, and intermediaries or the OS network stack buffer the connection, the&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;requests&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;library may not detect that the server is “silent” for longer than your timeout.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;&lt;STRONG&gt;Differences in HTTP stack between&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;requests&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;and&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;curl&lt;/CODE&gt;:&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;Curl might be handling TCP-level inactivity better and not being affected by any intermediate Databricks proxy as Python is.&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2 id="how-to-work-around-it" class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0 md:text-lg [hr+&amp;amp;]:mt-4"&gt;How to Work Around It&lt;/H2&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;1. Use&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;curl&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;via Subprocess&lt;/H2&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;Since&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;curl&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;works reliably in your environment, consider making the HTTP request via Python's&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;subprocess&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;module, capturing the output as needed.&lt;/P&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;2. Explicitly Set&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;stream=True&lt;/CODE&gt;&lt;/H2&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;Try setting&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;stream=True&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;in your&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;requests.post()&lt;/CODE&gt;. Then, read the response manually with a controlled timeout using lower-level socket timeouts.&lt;/P&gt;
&lt;DIV class="w-full md:max-w-[90vw]"&gt;
&lt;DIV class="codeWrapper text-light selection:text-super selection:bg-super/10 my-md relative flex flex-col rounded-lg font-mono text-sm font-normal bg-subtler"&gt;
&lt;DIV class="translate-y-xs -translate-x-xs bottom-xl mb-xl flex h-0 items-start justify-end md:sticky md:top-[calc(var(--header-height)+var(--size-xs))]"&gt;
&lt;DIV class="overflow-hidden rounded-full border-subtlest ring-subtlest divide-subtlest bg-base"&gt;
&lt;DIV class="border-subtlest ring-subtlest divide-subtlest bg-subtler"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;DIV class="-mt-xl"&gt;
&lt;DIV&gt;
&lt;DIV class="text-quiet bg-subtle py-xs px-sm inline-block rounded-br rounded-tl-lg text-xs font-thin" data-testid="code-language-indicator"&gt;python&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN&gt;&lt;CODE&gt;response &lt;SPAN class="token token operator"&gt;=&lt;/SPAN&gt; requests&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;post&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;,&lt;/SPAN&gt; stream&lt;SPAN class="token token operator"&gt;=&lt;/SPAN&gt;&lt;SPAN class="token token boolean"&gt;True&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;,&lt;/SPAN&gt; timeout&lt;SPAN class="token token operator"&gt;=&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;connect_timeout&lt;SPAN class="token token punctuation"&gt;,&lt;/SPAN&gt; read_timeout&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;
&lt;SPAN class="token token"&gt;for&lt;/SPAN&gt; chunk &lt;SPAN class="token token"&gt;in&lt;/SPAN&gt; response&lt;SPAN class="token token punctuation"&gt;.&lt;/SPAN&gt;iter_content&lt;SPAN class="token token punctuation"&gt;(&lt;/SPAN&gt;chunk_size&lt;SPAN class="token token operator"&gt;=&lt;/SPAN&gt;&lt;SPAN class="token token"&gt;8192&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;,&lt;/SPAN&gt; decode_unicode&lt;SPAN class="token token operator"&gt;=&lt;/SPAN&gt;&lt;SPAN class="token token boolean"&gt;False&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;)&lt;/SPAN&gt;&lt;SPAN class="token token punctuation"&gt;:&lt;/SPAN&gt;
    &lt;SPAN class="token token"&gt;# process chunk&lt;/SPAN&gt;
&lt;/CODE&gt;&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;But if the first byte from the server is delayed until post-processing is complete, this will not help.&lt;/P&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;3. Use Lower-Level HTTP Client&lt;/H2&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;Try using&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;http.client&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;(stdlib) for more customizable socket-level handling.&lt;/P&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;4. Test with Different Databricks Runtimes&lt;/H2&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;If possible, test the same code on different runtime versions, or an ML cluster vs. a non-ML cluster.&lt;/P&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;5. Confirm Network Middleboxes&lt;/H2&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;Check if Azure NSG rules or Databricks cluster network configuration involve proxies or load balancers. These might buffer idle connections differently between Python and system-level curl.&lt;/P&gt;
&lt;H2 class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0"&gt;6. Change Server Behavior (if possible)&lt;/H2&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;Ask the server owner to occasionally send whitespace or HTTP/1.1&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;100-continue&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;interim responses. You mentioned you can't control the server; if that's final, focus on client workarounds above.&lt;/P&gt;
&lt;H2 id="why-the-difference" class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0 md:text-lg [hr+&amp;amp;]:mt-4"&gt;Why the Difference?&lt;/H2&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;The most probable cause is that Databricks’ network path or virtualization introduces a condition where Python's&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;requests&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;and underlying sockets do not get notified of a closed socket, or the network stack masks silence. Curl’s handling at the OS level might bypass this issue, or uses different buffer or keepalive logic.&lt;/P&gt;
&lt;H2 id="summary-table" class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0 md:text-lg [hr+&amp;amp;]:mt-4"&gt;Summary Table&lt;/H2&gt;
&lt;DIV class="group relative"&gt;
&lt;DIV class="w-full overflow-x-auto md:max-w-[90vw] border-subtlest ring-subtlest divide-subtlest bg-transparent"&gt;
&lt;TABLE class="border-subtler my-[1em] w-full table-auto border-separate border-spacing-0 border-l border-t"&gt;
&lt;THEAD class="bg-subtler"&gt;
&lt;TR&gt;
&lt;TH class="border-subtler p-sm break-normal border-b border-r text-left align-top"&gt;Approach&lt;/TH&gt;
&lt;TH class="border-subtler p-sm break-normal border-b border-r text-left align-top"&gt;Databricks Python&lt;/TH&gt;
&lt;TH class="border-subtler p-sm break-normal border-b border-r text-left align-top"&gt;Local Python&lt;/TH&gt;
&lt;TH class="border-subtler p-sm break-normal border-b border-r text-left align-top"&gt;Databricks curl&lt;/TH&gt;
&lt;TH class="border-subtler p-sm break-normal border-b border-r text-left align-top"&gt;Local curl&lt;/TH&gt;
&lt;/TR&gt;
&lt;/THEAD&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;requests.post(timeout=10)&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Hangs indefinitely&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Behaves&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;N/A&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;N/A&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;subprocess.run(['curl'])&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Works&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Works&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Works&lt;/TD&gt;
&lt;TD class="px-sm border-subtler min-w-[48px] break-normal border-b border-r"&gt;Works&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;/DIV&gt;
&lt;DIV class="bg-base border-subtler shadow-subtle pointer-coarse:opacity-100 right-xs absolute bottom-0 flex rounded-lg border opacity-0 transition-opacity group-hover:opacity-100 [&amp;amp;&amp;gt;*:not(:first-child)]:border-subtle [&amp;amp;&amp;gt;*:not(:first-child)]:border-l"&gt;
&lt;DIV class="flex"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="flex"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;H2 id="recommendation" class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0 md:text-lg [hr+&amp;amp;]:mt-4"&gt;Recommendation&lt;/H2&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;For robust production pipelines in Azure Databricks, use&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;curl&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;or similar library via subprocess if server silence and networking quirks are causing issues for Python&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;requests&lt;/CODE&gt;.&lt;/P&gt;</description>
      <pubDate>Wed, 26 Nov 2025 12:39:03 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/long-running-python-http-post-hangs/m-p/140410#M51418</guid>
      <dc:creator>mark_ott</dc:creator>
      <dc:date>2025-11-26T12:39:03Z</dc:date>
    </item>
    <item>
      <title>Re: Long-running Python http POST hangs</title>
      <link>https://community.databricks.com/t5/data-engineering/long-running-python-http-post-hangs/m-p/140598#M51479</link>
      <description>&lt;P&gt;Thanks for your quick and extensive reply.&lt;BR /&gt;Given that I have no administration rights on the Azure/Databricks environment and no control over the REST server, some of these sensible suggestions are difficult for me to apply.&lt;BR /&gt;I will work with IT to check the Azure/Databricks settings.&lt;BR /&gt;In the meantime I will keep using the curl workaround.&lt;/P&gt;</description>
      <pubDate>Fri, 28 Nov 2025 11:43:34 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/long-running-python-http-post-hangs/m-p/140598#M51479</guid>
      <dc:creator>Johan_Van_Noten</dc:creator>
      <dc:date>2025-11-28T11:43:34Z</dc:date>
    </item>
    <item>
      <title>Re: Long-running Python http POST hangs</title>
      <link>https://community.databricks.com/t5/data-engineering/long-running-python-http-post-hangs/m-p/140671#M51504</link>
      <description>&lt;P&gt;Hello,&lt;BR /&gt;IMHO, running an HTTP-related task on a Spark cluster is an anti-pattern. This kind of code executes on the driver, runs synchronously, and adds overhead. This is one of the reasons DLT (or SDP - Spark Declarative Pipeline) does not have REST-based tasks.&lt;BR /&gt;&lt;BR /&gt;Please check whether this task can be done outside Databricks, for example:&lt;BR /&gt;1) Event-based trigger:&amp;nbsp;push the result from Databricks to cloud storage; this raises an event (Event Grid) for a listener such as a Function App or Logic App, which performs the HTTP task.&lt;BR /&gt;2) Classic poller: an Azure Function App checks for an expected condition every 'n' minutes and, if it is met, executes the HTTP task.&lt;/P&gt;</description>
      <pubDate>Sun, 30 Nov 2025 21:18:14 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/long-running-python-http-post-hangs/m-p/140671#M51504</guid>
      <dc:creator>siva-anantha</dc:creator>
      <dc:date>2025-11-30T21:18:14Z</dc:date>
    </item>
  </channel>
</rss>

