<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: python multiprocessing hangs at map on one cluster but works fine on another in Get Started Discussions</title>
    <link>https://community.databricks.com/t5/get-started-discussions/python-multiprocessing-hangs-at-map-on-one-cluster-but-works/m-p/72735#M7441</link>
    <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/102253"&gt;@jacovangelder&lt;/a&gt;&amp;nbsp;Thanks for the response. As I mentioned in my reply to &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/97998"&gt;@raphaelblg&lt;/a&gt;, the inconsistent behavior is what I'm not able to explain at the moment.&lt;/P&gt;</description>
    <pubDate>Wed, 12 Jun 2024 06:26:44 GMT</pubDate>
    <dc:creator>mh-hsn</dc:creator>
    <dc:date>2024-06-12T06:26:44Z</dc:date>
    <item>
      <title>python multiprocessing hangs at map on one cluster but works fine on another</title>
      <link>https://community.databricks.com/t5/get-started-discussions/python-multiprocessing-hangs-at-map-on-one-cluster-but-works/m-p/72232#M7437</link>
      <description>&lt;P&gt;I have a simple Python script that has been running fine on my cluster, but recently the same script gets stuck at map. So I created a new cluster with fewer resources, ran the same script there, and it ran just fine.&lt;/P&gt;&lt;P&gt;Here are the specifications of both clusters:&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;My old cluster:&lt;/STRONG&gt;&lt;BR /&gt;9.1-LTS ML (includes Apache Spark 3.1.2, Scala 2.12)&lt;BR /&gt;Worker type: Standard_D16s_v3 (min 1, max 8)&lt;BR /&gt;Driver type: Standard_D64s_v3&lt;BR /&gt;Spot instances = True&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;My new cluster:&lt;/STRONG&gt;&lt;BR /&gt;9.1-LTS ML (includes Apache Spark 3.1.2, Scala 2.12)&lt;BR /&gt;Worker type: Standard_DS3_v2 (min 1, max 8)&lt;BR /&gt;Driver type: Standard_DS3_v2&lt;BR /&gt;Spot instances = True&lt;/P&gt;&lt;LI-CODE lang="python"&gt;import multiprocessing
from functools import partial


# Define the function to process each row
def process_row(row, func):
    index, data = row
    if data['some_new_column'] == '':
        data['some_new_column'] = func(data['text'])
    return index, data


# Define the function for parallel processing
def parallel_process(data, func, num_processes):
    pool = multiprocessing.Pool(processes=num_processes)
    func_partial = partial(process_row, func=func)
    print('Starting mapping...')
    processed_data = pool.map(func_partial, data)
    pool.close()
    pool.join()
    return processed_data

num_processes = multiprocessing.cpu_count()

# Apply parallel processing to speed up the operation
processed_data = parallel_process(models_df.iterrows(), my_custom_func, num_processes)&lt;/LI-CODE&gt;</description>
      <pubDate>Mon, 10 Jun 2024 13:42:12 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/python-multiprocessing-hangs-at-map-on-one-cluster-but-works/m-p/72232#M7437</guid>
      <dc:creator>mh-hsn</dc:creator>
      <dc:date>2024-06-10T13:42:12Z</dc:date>
    </item>
    <item>
      <title>Re: python multiprocessing hangs at map on one cluster but works fine on another</title>
      <link>https://community.databricks.com/t5/get-started-discussions/python-multiprocessing-hangs-at-map-on-one-cluster-but-works/m-p/72266#M7438</link>
      <description>&lt;P&gt;Hello &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/106674"&gt;@mh-hsn&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;
&lt;P&gt;First of all, let me clarify that all of your multiprocessing runs on the driver node, where your cluster's main Python process lives. Your executors' configuration has no impact on this scenario.&lt;/P&gt;
&lt;P&gt;In your old cluster (Standard_D64s_v3 driver) you have 64 vCPUs available, so multiprocessing.cpu_count() lets you spawn up to 64 simultaneous worker processes on the driver node. In your new cluster (Standard_DS3_v2 driver) there are only 4 vCPUs available, which limits you to just 4 workers.&lt;/P&gt;
&lt;P&gt;Running 64 parallel workers can be resource-intensive, even for a Standard_D64s_v3 driver; your driver may be getting stuck due to OOM issues or excessive Python GC. The Standard_DS3_v2 driver is less capable, but 4 workers are probably not enough to cause OOM or excessive GC.&lt;/P&gt;
&lt;P&gt;Without memory metrics and cluster logs it's challenging to confirm the root cause, but based on your description I believe this assessment is likely accurate.&lt;/P&gt;</description>
      <pubDate>Mon, 10 Jun 2024 20:16:10 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/python-multiprocessing-hangs-at-map-on-one-cluster-but-works/m-p/72266#M7438</guid>
      <dc:creator>raphaelblg</dc:creator>
      <dc:date>2024-06-10T20:16:10Z</dc:date>
    </item>
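The driver-size arithmetic in the reply above can be checked directly from a notebook. The ceiling of 8 below is an arbitrary illustrative cap, not a recommended value; the point is that each extra process duplicates interpreter state, so a hard ceiling bounds the OOM and GC pressure described:

```python
import multiprocessing

# On Databricks, multiprocessing sees only the driver's vCPUs:
# a Standard_D64s_v3 driver reports 64, a Standard_DS3_v2 reports 4.
driver_vcpus = multiprocessing.cpu_count()

# A memory-conscious cap (8 is an arbitrary example value).
MAX_WORKERS = 8
num_processes = min(driver_vcpus, MAX_WORKERS)
```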
    <item>
      <title>Re: python multiprocessing hangs at map on one cluster but works fine on another</title>
      <link>https://community.databricks.com/t5/get-started-discussions/python-multiprocessing-hangs-at-map-on-one-cluster-but-works/m-p/72353#M7439</link>
      <description>&lt;P&gt;I agree with&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/97998"&gt;@raphaelblg&lt;/a&gt;: most likely you're running out of memory. Multiprocessing and thread pools unfortunately do not benefit from extra workers, since they run only on your driver node. This is very annoying and not a widely known fact, and the Spark driver also often bugs out because of it. I've read that a "for each" activity is currently in private preview; hopefully that will resolve some of these issues.&lt;/P&gt;</description>
      <pubDate>Tue, 11 Jun 2024 16:54:16 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/python-multiprocessing-hangs-at-map-on-one-cluster-but-works/m-p/72353#M7439</guid>
      <dc:creator>jacovangelder</dc:creator>
      <dc:date>2024-06-11T16:54:16Z</dc:date>
    </item>
    <item>
      <title>Re: python multiprocessing hangs at map on one cluster but works fine on another</title>
      <link>https://community.databricks.com/t5/get-started-discussions/python-multiprocessing-hangs-at-map-on-one-cluster-but-works/m-p/72732#M7440</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/97998"&gt;@raphaelblg&lt;/a&gt;&amp;nbsp;Thank you for your response, I really appreciate it. After your answer, I was able to resolve the issue by reducing "num_processes" on my old cluster. But what I still don't understand is why it had been executing successfully in the past, because the function I apply to my dataframe rows just makes some API calls. That inconsistent behavior is what I can't explain at the moment. Anyway, thanks again for the help.&lt;/P&gt;</description>
      <pubDate>Wed, 12 Jun 2024 06:22:53 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/python-multiprocessing-hangs-at-map-on-one-cluster-but-works/m-p/72732#M7440</guid>
      <dc:creator>mh-hsn</dc:creator>
      <dc:date>2024-06-12T06:22:53Z</dc:date>
    </item>
    <item>
      <title>Re: python multiprocessing hangs at map on one cluster but works fine on another</title>
      <link>https://community.databricks.com/t5/get-started-discussions/python-multiprocessing-hangs-at-map-on-one-cluster-but-works/m-p/72735#M7441</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/102253"&gt;@jacovangelder&lt;/a&gt;&amp;nbsp;Thanks for the response. As I mentioned in my reply to &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/97998"&gt;@raphaelblg&lt;/a&gt;, the inconsistent behavior is what I'm not able to explain at the moment.&lt;/P&gt;</description>
      <pubDate>Wed, 12 Jun 2024 06:26:44 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/python-multiprocessing-hangs-at-map-on-one-cluster-but-works/m-p/72735#M7441</guid>
      <dc:creator>mh-hsn</dc:creator>
      <dc:date>2024-06-12T06:26:44Z</dc:date>
    </item>
    <item>
      <title>Re: python multiprocessing hangs at map on one cluster but works fine on another</title>
      <link>https://community.databricks.com/t5/get-started-discussions/python-multiprocessing-hangs-at-map-on-one-cluster-but-works/m-p/72742#M7442</link>
      <description>&lt;P&gt;It's very difficult to pinpoint the exact issue on the driver node, really. I've seen some very inconsistent behaviour myself this week using thread pools/multiprocessing: one day it runs fine, the next day the Spark driver bugs out running the same (vanilla Python, not even Spark) workload, while still having plenty of free memory. My best advice would be to reconsider the multiprocessing/thread pools and accept a little less parallelism in exchange for more consistency.&lt;/P&gt;</description>
      <pubDate>Wed, 12 Jun 2024 07:31:26 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/python-multiprocessing-hangs-at-map-on-one-cluster-but-works/m-p/72742#M7442</guid>
      <dc:creator>jacovangelder</dc:creator>
      <dc:date>2024-06-12T07:31:26Z</dc:date>
    </item>
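One hedged sketch of the "less parallelism, more consistency" advice: since the per-row work here is API calls (I/O-bound), a small thread pool keeps driver memory flat while still overlapping requests, avoiding the per-process overhead of multiprocessing. call_api is a hypothetical stand-in for the poster's real function:

```python
from concurrent.futures import ThreadPoolExecutor


def call_api(text):
    # Hypothetical stand-in for the per-row API call in the original script.
    return text.upper()


def enrich_rows(rows, max_workers=4):
    # Threads share one interpreter, so memory use stays roughly flat as
    # parallelism grows, a gentler option than a process pool for I/O-bound
    # work. The small max_workers trades peak throughput for consistency.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(call_api, rows))


results = enrich_rows(['hello', 'world'])
```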
  </channel>
</rss>

