<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Is Photon Acceleration Helpful for All Maintenance Tasks (OPTIMIZE, VACUUM, ANALYZE_COMPUTE_STATS)? in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/is-photon-acceleration-helpful-for-all-maintenance-tasks/m-p/130354#M48768</link>
    <description>&lt;P&gt;Hi everyone,&lt;/P&gt;&lt;P&gt;We’re currently reviewing the performance impact of enabling Photon acceleration on our Databricks jobs, particularly those involving table maintenance tasks. Our job includes three main operations: OPTIMIZE, VACUUM, and ANALYZE_COMPUTE_STATS. We’ve observed that enabling Photon significantly improves the performance of the ANALYZE_COMPUTE_STATS task: it runs much faster when Photon is enabled on the cluster.&lt;/P&gt;&lt;P&gt;Given that, I’m wondering if enabling Photon for the other two tasks (OPTIMIZE and VACUUM) would also lead to better performance or reduced job time. Has anyone experienced improvements in these tasks with Photon?&lt;/P&gt;&lt;P&gt;Also, more generally, I’d like to understand which types of tasks or workloads benefit most from Photon acceleration.&lt;/P&gt;&lt;P&gt;Any insights, benchmarks, or shared experiences would be really helpful. Thanks!&lt;/P&gt;</description>
    <pubDate>Mon, 01 Sep 2025 11:08:47 GMT</pubDate>
    <dc:creator>Sainath368</dc:creator>
    <dc:date>2025-09-01T11:08:47Z</dc:date>
    <item>
      <title>Is Photon Acceleration Helpful for All Maintenance Tasks (OPTIMIZE, VACUUM, ANALYZE_COMPUTE_STATS)?</title>
      <link>https://community.databricks.com/t5/data-engineering/is-photon-acceleration-helpful-for-all-maintenance-tasks/m-p/130354#M48768</link>
      <description>&lt;P&gt;Hi everyone,&lt;/P&gt;&lt;P&gt;We’re currently reviewing the performance impact of enabling Photon acceleration on our Databricks jobs, particularly those involving table maintenance tasks. Our job includes three main operations: OPTIMIZE, VACUUM, and ANALYZE_COMPUTE_STATS. We’ve observed that enabling Photon significantly improves the performance of the ANALYZE_COMPUTE_STATS task: it runs much faster when Photon is enabled on the cluster.&lt;/P&gt;&lt;P&gt;Given that, I’m wondering if enabling Photon for the other two tasks (OPTIMIZE and VACUUM) would also lead to better performance or reduced job time. Has anyone experienced improvements in these tasks with Photon?&lt;/P&gt;&lt;P&gt;Also, more generally, I’d like to understand which types of tasks or workloads benefit most from Photon acceleration.&lt;/P&gt;&lt;P&gt;Any insights, benchmarks, or shared experiences would be really helpful. Thanks!&lt;/P&gt;</description>
      <pubDate>Mon, 01 Sep 2025 11:08:47 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/is-photon-acceleration-helpful-for-all-maintenance-tasks/m-p/130354#M48768</guid>
      <dc:creator>Sainath368</dc:creator>
      <dc:date>2025-09-01T11:08:47Z</dc:date>
    </item>
    <item>
      <title>Re: Is Photon Acceleration Helpful for All Maintenance Tasks (OPTIMIZE, VACUUM, ANALYZE_COMPUTE_STAT</title>
      <link>https://community.databricks.com/t5/data-engineering/is-photon-acceleration-helpful-for-all-maintenance-tasks/m-p/130359#M48769</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/166046"&gt;@Sainath368&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;I wouldn't use Photon for this kind of task. Use it primarily for ETL transformations, where it shines.&lt;BR /&gt;VACUUM and OPTIMIZE are maintenance tasks, and using Photon for them would be pricey overkill.&lt;/P&gt;&lt;P&gt;According to the documentation, enabling Photon is recommended for workloads with the following characteristics:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;ETL pipelines consisting of Delta MERGE operations&lt;/LI&gt;&lt;LI&gt;Writing large volumes of data to cloud storage (Delta/Parquet)&lt;/LI&gt;&lt;LI&gt;Scans of large data sets, joins, aggregations and decimal computations&lt;/LI&gt;&lt;LI&gt;Auto Loader to incrementally and efficiently process new data arriving in storage&lt;/LI&gt;&lt;LI&gt;Interactive/ad hoc queries using SQL&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;The same guide lists these advantages of Photon:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Accelerated queries that process a significant amount of data (&amp;gt; 100GB) and include aggregations and joins&lt;/LI&gt;&lt;LI&gt;Faster performance when data is accessed repeatedly from the Delta cache&lt;/LI&gt;&lt;LI&gt;More robust scan/read performance on tables with many columns and many small files&lt;/LI&gt;&lt;LI&gt;Faster Delta writing using UPDATE, DELETE, MERGE INTO, INSERT, and CREATE TABLE AS SELECT&lt;/LI&gt;&lt;LI&gt;Join improvements&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;A href="https://www.databricks.com/discover/pages/optimize-data-workloads-guide" target="_blank" rel="noopener"&gt;Comprehensive Guide to Optimize Data Workloads | Databricks&lt;/A&gt;&lt;/P&gt;&lt;P&gt;For instance, for VACUUM, Databricks recommends using compute-optimized instances. Since OPTIMIZE is also compute-intensive, I guess the same recommendation applies to it.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="szymon_dybczak_0-1756727080230.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/19535i85C9884BA45F1740/image-size/medium?v=v2&amp;amp;px=400" role="button" title="szymon_dybczak_0-1756727080230.png" alt="szymon_dybczak_0-1756727080230.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 01 Sep 2025 11:48:28 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/is-photon-acceleration-helpful-for-all-maintenance-tasks/m-p/130359#M48769</guid>
      <dc:creator>szymon_dybczak</dc:creator>
      <dc:date>2025-09-01T11:48:28Z</dc:date>
    </item>
  </channel>
</rss>

