<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Photon enabled UC cluster has less executor memory(1/4th) compared to normal cluster. in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/photon-enabled-uc-cluster-has-less-executor-memory-1-4th/m-p/103902#M41597</link>
    <pubDate>Thu, 02 Jan 2025 12:05:00 GMT</pubDate>
    <dc:creator>Walter_C</dc:creator>
    <dc:date>2025-01-02T12:05:00Z</dc:date>
    <item>
      <title>Photon enabled UC cluster has less executor memory(1/4th) compared to normal cluster.</title>
      <link>https://community.databricks.com/t5/data-engineering/photon-enabled-uc-cluster-has-less-executor-memory-1-4th/m-p/103896#M41596</link>
      <description>&lt;P&gt;I have a Unity Catalog enabled cluster with node type Standard_DS4_v2 (28 GB memory, 8 cores). When the &lt;EM&gt;"Use Photon Acceleration"&lt;/EM&gt; option is disabled, &lt;EM&gt;spark.executor.memory is 18409m&lt;/EM&gt;. But if I enable Photon Acceleration, it shows &lt;EM&gt;spark.executor.memory as 4602m&lt;/EM&gt;. Because of this, most of the code I have written fails with the following error:&lt;/P&gt;&lt;P&gt;&lt;EM&gt;org.apache.spark.memory.SparkOutOfMemoryError: Photon ran out of memory while executing this query.&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Photon Enabled Cluster: &lt;/STRONG&gt;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Spark Version: 13.3.x-photon-scala2.12&lt;/LI&gt;&lt;LI&gt;Executor Memory: 4602m&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;STRONG&gt;Photon Disabled Cluster: &lt;/STRONG&gt;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Spark Version: 13.3.x-scala2.12&lt;/LI&gt;&lt;LI&gt;Executor Memory: 18409m&lt;/LI&gt;&lt;/UL&gt;&lt;OL&gt;&lt;LI&gt;Why does enabling Photon reduce the executor memory?&lt;/LI&gt;&lt;LI&gt;Is there a way to keep spark.executor.memory at 18409m with Photon enabled?&lt;/LI&gt;&lt;/OL&gt;</description>
      <pubDate>Thu, 02 Jan 2025 11:52:04 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/photon-enabled-uc-cluster-has-less-executor-memory-1-4th/m-p/103896#M41596</guid>
      <dc:creator>Einsatz</dc:creator>
      <dc:date>2025-01-02T11:52:04Z</dc:date>
    </item>
    <item>
      <title>Re: Photon enabled UC cluster has less executor memory(1/4th) compared to normal cluster.</title>
      <link>https://community.databricks.com/t5/data-engineering/photon-enabled-uc-cluster-has-less-executor-memory-1-4th/m-p/103902#M41597</link>
      <description>&lt;P class="_1t7bu9h1 paragraph"&gt;Enabling Photon Acceleration on your Databricks cluster reduces the available executor memory because Photon uses a different memory management strategy compared to standard Spark. Photon is designed to optimize performance by leveraging the underlying hardware more efficiently, but this comes at the cost of reduced memory allocation for Spark executors.&lt;/P&gt;
&lt;P class="_1t7bu9h1 paragraph"&gt;&lt;SPAN&gt;To address the issue of reduced executor memory when Photon is enabled, you can try the following approaches:&lt;/SPAN&gt;&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;
&lt;P class="_1t7bu9h1 paragraph"&gt;&lt;SPAN&gt;&lt;STRONG&gt;Increase the Node Size&lt;/STRONG&gt;: Upgrade your cluster to use larger node types with more memory. For example, you can switch from &lt;CODE&gt;Standard_DS4_v2&lt;/CODE&gt; to &lt;CODE&gt;Standard_DS5_v2&lt;/CODE&gt;, which provides more memory and CPU resources.&lt;/SPAN&gt;&lt;/P&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P class="_1t7bu9h1 paragraph"&gt;&lt;SPAN&gt;&lt;STRONG&gt;Adjust Spark Configuration&lt;/STRONG&gt;: You can fine-tune Spark configurations to optimize memory usage. For instance, increasing the number of shuffle partitions can help distribute the workload more evenly and reduce memory pressure on individual executors. You can set this configuration at the cluster level:&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;DIV class="gb5fhw2"&gt;
&lt;PRE&gt;&lt;CODE class="markdown-code-plaintext _1t7bu9hb hljs language-yaml gb5fhw3"&gt;&lt;SPAN class="hljs-string"&gt;spark.sql.shuffle.partitions&lt;/SPAN&gt; &lt;SPAN class="hljs-number"&gt;1000&lt;/SPAN&gt;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;/DIV&gt;
&lt;/LI&gt;
&lt;/OL&gt;</description>
      <pubDate>Thu, 02 Jan 2025 12:05:00 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/photon-enabled-uc-cluster-has-less-executor-memory-1-4th/m-p/103902#M41597</guid>
      <dc:creator>Walter_C</dc:creator>
      <dc:date>2025-01-02T12:05:00Z</dc:date>
    </item>
    <item>
      <title>Re: Photon enabled UC cluster has less executor memory(1/4th) compared to normal cluster.</title>
      <link>https://community.databricks.com/t5/data-engineering/photon-enabled-uc-cluster-has-less-executor-memory-1-4th/m-p/103992#M41626</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/139768"&gt;@Einsatz&lt;/a&gt;&amp;nbsp;thanks for your question!&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;1) Why does enabling Photon reduce the executor memory?&lt;/STRONG&gt;&lt;BR /&gt;Photon allocates a significant portion of memory off-heap for its C++ engine. As a result, the on-heap memory (shown by spark.executor.memory) appears lower once Photon is enabled.&lt;/P&gt;
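&lt;P&gt;As a quick check of the "1/4th" figure in the question, the two reported heap sizes can be compared directly. This is only arithmetic on the values quoted in the thread; the variable names are hypothetical, and nothing here says how Databricks actually sizes the Photon region.&lt;/P&gt;

```python
# Values of spark.executor.memory reported in the original question (in MB).
heap_without_photon_mb = 18409
heap_with_photon_mb = 4602

# Fraction of the non-Photon heap that remains on-heap once Photon is enabled.
ratio = heap_with_photon_mb / heap_without_photon_mb

# Memory that no longer shows up as JVM heap on the Photon cluster.
difference_mb = heap_without_photon_mb - heap_with_photon_mb

print(f"on-heap ratio with Photon: {ratio:.3f}")
print(f"memory no longer reported as heap: {difference_mb} MB")
```

&lt;P&gt;The ratio comes out at almost exactly one quarter, so roughly 13.8 GB per executor is no longer reported as heap, which is consistent with it being reserved for Photon's off-heap use as described above.&lt;/P&gt;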
&lt;P&gt;&lt;STRONG&gt;2) Is there a way to keep spark.executor.memory at 18409m when Photon is enabled?&lt;/STRONG&gt;&lt;BR /&gt;Not directly. You must either increase your node’s total memory (e.g., choose a larger instance type) or adjust off-heap allocations to accommodate Photon’s requirements.&lt;/P&gt;
&lt;P&gt;Photon is a separate C++ engine embedded within Spark to accelerate certain SQL workloads, so it requires its own memory space. You can either provision extra memory for Photon or run those queries on the regular Spark engine with full on-heap capacity. This is a trade-off you need to weigh and plan for.&lt;/P&gt;</description>
      <pubDate>Thu, 02 Jan 2025 18:27:29 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/photon-enabled-uc-cluster-has-less-executor-memory-1-4th/m-p/103992#M41626</guid>
      <dc:creator>VZLA</dc:creator>
      <dc:date>2025-01-02T18:27:29Z</dc:date>
    </item>
    <item>
      <title>Re: Photon enabled UC cluster has less executor memory(1/4th) compared to normal cluster.</title>
      <link>https://community.databricks.com/t5/data-engineering/photon-enabled-uc-cluster-has-less-executor-memory-1-4th/m-p/104095#M41664</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/34618"&gt;@VZLA&lt;/a&gt;/&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/88823"&gt;@Walter_C&lt;/a&gt;&amp;nbsp;&amp;nbsp; Thanks for the quick answers! I understand that the Photon engine requires memory for its optimization tasks, and this memory usage impacts the executor memory.&lt;/P&gt;&lt;P&gt;I’ve got a few more questions, and I’d really appreciate it if you could help me out.&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;Is the memory allocated to the Photon engine fixed, or is it based on a percentage of the node’s total memory?&lt;/LI&gt;&lt;LI&gt;How can I calculate the value of &lt;EM&gt;spark.executor.memory&lt;/EM&gt; based on a specific node type? I’ve gone through some articles to understand Spark's memory allocation, but the results don’t match the &lt;EM&gt;spark.executor.memory&lt;/EM&gt; value set by Databricks.&lt;/LI&gt;&lt;LI&gt;I need clarification on how memory is allocated and on the memory values displayed on different tabs of the Databricks Spark UI. Below are the configuration values for my node type, &lt;EM&gt;Standard_DS4_v2 (28GB RAM, 8 cores)&lt;/EM&gt;.&lt;BR /&gt;&lt;OL&gt;&lt;LI&gt;What does the &lt;EM&gt;'Storage Memory'&lt;/EM&gt; column in the &lt;EM&gt;Spark UI -&amp;gt; Executors&lt;/EM&gt; represent? In my case, it shows &lt;EM&gt;9.4GB&lt;/EM&gt;. I assume this is half of &lt;EM&gt;18409m&lt;/EM&gt;, so does it indicate only the storage memory portion of the executor, which is 50% of the total executor memory? If so, can I conclude that the remaining 9.4GB is used for execution memory?&lt;/LI&gt;&lt;LI&gt;What is &lt;EM&gt;spark.executor.memory&lt;/EM&gt; (18409m = 17.97GB)? How can I calculate this value based on a specific node type &lt;EM&gt;X&lt;/EM&gt; (similar to question 2 asked above)?&lt;/LI&gt;&lt;LI&gt;What does the &lt;EM&gt;'Memory'&lt;/EM&gt; column in the &lt;EM&gt;Spark compute UI - Master -&amp;gt; Workers&lt;/EM&gt;&amp;nbsp;represent? It's showing &lt;EM&gt;22.5GiB (18.0GiB used)&lt;/EM&gt;. I assume &lt;EM&gt;18.0GiB&lt;/EM&gt; corresponds to &lt;EM&gt;18409m&lt;/EM&gt;, but what does the &lt;EM&gt;22.5GiB&lt;/EM&gt; indicate, considering the node memory is &lt;EM&gt;28GB&lt;/EM&gt;?&lt;/LI&gt;&lt;LI&gt;&lt;SPAN&gt;What does the &lt;EM&gt;'Memory per Executor'&lt;/EM&gt; column in the &lt;EM&gt;Spark compute UI - Master -&amp;gt; Running Applications&lt;/EM&gt; refer to? It shows &lt;EM&gt;18409m&lt;/EM&gt;. Is this the same as the value in question 2?&lt;/SPAN&gt;&lt;/LI&gt;&lt;/OL&gt;&lt;/LI&gt;&lt;/OL&gt;</description>
      <pubDate>Fri, 03 Jan 2025 15:28:52 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/photon-enabled-uc-cluster-has-less-executor-memory-1-4th/m-p/104095#M41664</guid>
      <dc:creator>Einsatz</dc:creator>
      <dc:date>2025-01-03T15:28:52Z</dc:date>
    </item>
    <item>
      <title>Re: Photon enabled UC cluster has less executor memory(1/4th) compared to normal cluster.</title>
      <link>https://community.databricks.com/t5/data-engineering/photon-enabled-uc-cluster-has-less-executor-memory-1-4th/m-p/104096#M41665</link>
      <description>&lt;P class="_1t7bu9h1 paragraph"&gt;The memory allocated to the Photon engine is not fixed; it is based on a percentage of the node’s total memory.&lt;/P&gt;
&lt;P class="_1t7bu9h1 paragraph"&gt;&lt;SPAN&gt;To calculate the value of &lt;CODE&gt;spark.executor.memory&lt;/CODE&gt; based on a specific node type, you can use the following formula:&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;CODE&gt;
container_size = (vm_size * 0.97 - 4800MB)
&lt;BR /&gt;spark.executor.memory = (0.8 * container_size)
&lt;/CODE&gt; &lt;BR /&gt;For your node type, Standard_DS4_v2 (28GB RAM, 8 cores), taking 28GB as 28672MB, the calculation would be: &lt;CODE&gt;
container_size = (28672MB * 0.97 - 4800MB) ≈ 23012MB
&lt;BR /&gt;spark.executor.memory = (0.8 * 23012MB) ≈ 18409MB
&lt;/CODE&gt; &lt;BR /&gt;This results in approximately 17.97GB (18409m).&lt;/P&gt;
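&lt;P&gt;As a rough sketch, the formula above can be checked in a few lines of Python. The function and parameter names are hypothetical, and reading 28GB as 28 * 1024 MB is an assumption (it is the reading that reproduces 18409m). The constants are the ones quoted in this reply, not an official Databricks API.&lt;/P&gt;

```python
MB_PER_GB = 1024  # assumption: binary units, chosen because they reproduce 18409m


def container_size_mb(vm_size_gb, usable_fraction=0.97, fixed_overhead_mb=4800):
    """Container size in MB: vm_size * 0.97 - 4800MB (constants from the reply)."""
    return vm_size_gb * MB_PER_GB * usable_fraction - fixed_overhead_mb


def executor_memory_mb(vm_size_gb, heap_fraction=0.8):
    """spark.executor.memory in whole MB: 0.8 * container_size."""
    return int(heap_fraction * container_size_mb(vm_size_gb))


# Standard_DS4_v2 from the question: 28GB RAM.
container = container_size_mb(28)
heap = executor_memory_mb(28)

print(f"container_size: {container:.2f} MB")
print(f"spark.executor.memory: {heap}m")
```

&lt;P&gt;For 28GB this gives a container size of about 23012MB and an executor heap of 18409m. Other node types may follow the same pattern, but the result should be checked against the value the cluster actually reports.&lt;/P&gt;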
&lt;P class="_1t7bu9h1 paragraph"&gt;&lt;SPAN&gt;Regarding the 'Storage Memory' column in the Spark UI -&amp;gt; Executors, it represents the amount of memory allocated for storage (caching) within the executor. In your case, it shows 9.4GB, which is half of the total executor memory (18409m). This indicates that 50% of the total executor memory is allocated for storage memory, and the remaining 50% is used for execution memory.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class="_1t7bu9h1 paragraph"&gt;&lt;SPAN&gt;The 'Memory' column in the Spark compute UI - Master -&amp;gt; Workers represents the total memory allocated to the worker node. The 22.5GiB (18.0GiB used) indicates that 18.0GiB corresponds to the &lt;CODE&gt;spark.executor.memory&lt;/CODE&gt; value (18409m), and the remaining memory is used by other processes and overheads.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class="_1t7bu9h1 paragraph"&gt;&lt;SPAN&gt;The 'Memory per Executor' column in the Spark compute UI - Master -&amp;gt; Running Applications refers to the memory allocated per executor, which in your case is 18409m. This is the same value as the &lt;CODE&gt;spark.executor.memory&lt;/CODE&gt; calculated above&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 03 Jan 2025 15:31:20 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/photon-enabled-uc-cluster-has-less-executor-memory-1-4th/m-p/104096#M41665</guid>
      <dc:creator>Walter_C</dc:creator>
      <dc:date>2025-01-03T15:31:20Z</dc:date>
    </item>
  </channel>
</rss>

