<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Set up AI-driven optimizations in Databricks SQL in Community Articles</title>
    <link>https://community.databricks.com/t5/community-articles/set-up-ai-driven-optimizations-in-databricks-sql/m-p/43916#M76</link>
    <description>&lt;P&gt;&lt;SPAN&gt;With Predictive I/O for reads (GA) and updates (Public Preview), Databricks SQL can now analyze historical read and write patterns to intelligently build indexes and optimize DELETE, MERGE, and UPDATE operations.&lt;/SPAN&gt;&lt;/P&gt;
&lt;H3&gt;&lt;STRONG&gt;What is Predictive I/O?&lt;/STRONG&gt;&lt;/H3&gt;
&lt;P&gt;Predictive I/O is a collection of Databricks optimizations that improve performance for data interactions. Predictive I/O capabilities are grouped into the following categories:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Accelerated reads reduce the time it takes to scan and read data.&lt;/LI&gt;
&lt;LI&gt;Accelerated updates reduce the amount of data that needs to be rewritten during updates, deletes, and merges.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Predictive I/O leverages deletion vectors to accelerate updates by reducing the frequency of full file rewrites during data modification on Delta tables. Predictive I/O optimizes &lt;CODE&gt;DELETE&lt;/CODE&gt;, &lt;CODE&gt;MERGE&lt;/CODE&gt;, and &lt;CODE&gt;UPDATE&lt;/CODE&gt; operations.&lt;/P&gt;
&lt;P&gt;Rather than rewriting all records in a data file when any record is updated or deleted, Predictive I/O uses deletion vectors to mark records as removed from the target data files. Supplemental data files are used to indicate updates.&lt;/P&gt;
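&lt;P&gt;As a rough illustration, these are the kinds of statements the optimizations above target. The table and column names (&lt;CODE&gt;orders&lt;/CODE&gt;, &lt;CODE&gt;order_updates&lt;/CODE&gt;, &lt;CODE&gt;order_id&lt;/CODE&gt;, &lt;CODE&gt;status&lt;/CODE&gt;) are hypothetical placeholders and do not come from the Databricks documentation:&lt;/P&gt;
&lt;PRE&gt;&lt;SPAN&gt;-- Hypothetical tables and columns, for illustration only.
DELETE FROM orders WHERE status = 'cancelled';

UPDATE orders SET status = 'shipped' WHERE order_id = 1001;

MERGE INTO orders AS t
USING order_updates AS s
  ON t.order_id = s.order_id
WHEN MATCHED THEN UPDATE SET t.status = s.status
WHEN NOT MATCHED THEN INSERT (order_id, status) VALUES (s.order_id, s.status);&lt;/SPAN&gt;&lt;/PRE&gt;
&lt;P&gt;With deletion vectors enabled on the table, a statement like the &lt;CODE&gt;DELETE&lt;/CODE&gt; above can mark the affected rows as removed instead of rewriting every data file that contains them.&lt;/P&gt;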
&lt;H3&gt;&lt;STRONG&gt;How to get started:&lt;/STRONG&gt;&lt;/H3&gt;
&lt;P&gt;1. Use a serverless or pro SQL warehouse, or a Photon-accelerated cluster running Databricks Runtime 11.2 or above.&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;2. Enable support for &lt;A href="https://docs.databricks.com/en/delta/deletion-vectors.html" target="_blank" rel="noopener"&gt;deletion vectors&lt;/A&gt; on a Delta Lake table by setting a table property, as shown below:&lt;/SPAN&gt;&lt;/P&gt;
&lt;PRE&gt;&lt;SPAN&gt;ALTER TABLE &amp;lt;table-name&amp;gt; SET TBLPROPERTIES ('delta.enableDeletionVectors' = true);&lt;/SPAN&gt;&lt;/PRE&gt;
&lt;P&gt;Deletion vectors are a storage optimization feature that can be enabled on Delta Lake tables. See the &lt;A href="https://docs.databricks.com/en/delta/deletion-vectors.html" target="_self"&gt;deletion vectors documentation&lt;/A&gt; to learn more.&lt;/P&gt;
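&lt;P&gt;To confirm that the property was applied, one option is to inspect the table afterwards. This is a minimal sketch that reuses the same &lt;CODE&gt;&amp;lt;table-name&amp;gt;&lt;/CODE&gt; placeholder; the exact output may vary by runtime version:&lt;/P&gt;
&lt;PRE&gt;&lt;SPAN&gt;-- Return the current value of the deletion vectors property.
SHOW TBLPROPERTIES &amp;lt;table-name&amp;gt; ('delta.enableDeletionVectors');

-- Show table metadata, including the reader and writer protocol versions.
DESCRIBE DETAIL &amp;lt;table-name&amp;gt;;&lt;/SPAN&gt;&lt;/PRE&gt;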
&lt;H3&gt;&lt;STRONG&gt;Things to consider:&lt;/STRONG&gt;&lt;/H3&gt;
&lt;P&gt;&lt;SPAN&gt;When you enable deletion vectors, the table protocol version is upgraded. &lt;STRONG&gt;Table protocol version upgrades are not reversible.&lt;/STRONG&gt; After upgrading, the table will not be readable by Delta Lake clients that do not support deletion vectors. See &lt;A class="" href="https://docs.databricks.com/en/delta/feature-compatibility.html" target="_blank" rel="noopener"&gt; &lt;SPAN&gt;How does Databricks manage Delta Lake feature compatibility?&lt;/SPAN&gt; &lt;/A&gt; &lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Predictive I/O updates share all limitations with deletion vectors. In Databricks Runtime 12.1 and greater, the following limitations exist:&lt;/SPAN&gt;&lt;/P&gt;
&lt;UL class=""&gt;
&lt;LI&gt;
&lt;P&gt;Delta Sharing is not supported on tables with deletion vectors enabled.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P&gt;You cannot &lt;A class="" href="https://docs.databricks.com/en/delta/generate-manifest.html" target="_blank" rel="noopener"&gt; &lt;SPAN&gt;generate a manifest file for a table with deletion vectors present&lt;/SPAN&gt; &lt;/A&gt;. Run &lt;CODE&gt;REORG TABLE ... APPLY (PURGE)&lt;/CODE&gt; and ensure no concurrent write operations are running in order to generate a manifest.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P&gt;&lt;SPAN&gt;You cannot &lt;/SPAN&gt;&lt;A style="font-family: inherit; background-color: #ffffff;" href="https://docs.delta.io/latest/presto-integration.html#step-3-update-manifests" target="_blank" rel="noopener"&gt;incrementally generate manifest files for a table with deletion vectors enabled.&lt;/A&gt;&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
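&lt;P&gt;The manifest workaround mentioned in the list above could look like the following sketch. It reuses the &lt;CODE&gt;&amp;lt;table-name&amp;gt;&lt;/CODE&gt; placeholder and assumes that no concurrent write operations are running against the table:&lt;/P&gt;
&lt;PRE&gt;&lt;SPAN&gt;-- Rewrite the affected data files so that no deletion vectors remain.
REORG TABLE &amp;lt;table-name&amp;gt; APPLY (PURGE);

-- Then generate the manifest for the table.
GENERATE symlink_format_manifest FOR TABLE &amp;lt;table-name&amp;gt;;&lt;/SPAN&gt;&lt;/PRE&gt;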
&lt;H3&gt;&lt;STRONG&gt;Resources:&lt;/STRONG&gt;&lt;/H3&gt;
&lt;UL class=""&gt;
&lt;LI&gt;&lt;SPAN&gt; &lt;A href="https://www.youtube.com/watch?v=TB8xrVeAbUo" target="_self"&gt;What's new in Databricks SQL (video)&lt;/A&gt; &lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://docs.databricks.com/en/optimizations/predictive-io.html#:~:text=Predictive%20I%2FO%20leverages%20deletion,%2C%20MERGE%20%2C%20and%20UPDATE%20operations." target="_self"&gt; &lt;SPAN&gt;What is Predictive I/O? (doc)&lt;/SPAN&gt; &lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://youtu.be/h4z4vBoxQ6s?t=6560" target="_self"&gt;Reynold Xin's Keynote (video)&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;SPAN&gt; &lt;A href="https://www.databricks.com/dataaisummit/session/databricks-sql-serverless-under-hood-how-we-use-ml-get-best-priceperformance/" target="_blank" rel="noopener"&gt;Databricks SQL Serverless Under the Hood: How We Use ML to Get the Best Price/Performance (video)&lt;/A&gt; &lt;/SPAN&gt;&lt;/LI&gt;
&lt;/UL&gt;</description>
    <pubDate>Thu, 14 Sep 2023 08:05:19 GMT</pubDate>
    <dc:creator>Sujitha</dc:creator>
    <dc:date>2023-09-14T08:05:19Z</dc:date>
    <item>
      <title>Set up AI-driven optimizations in Databricks SQL</title>
      <link>https://community.databricks.com/t5/community-articles/set-up-ai-driven-optimizations-in-databricks-sql/m-p/43916#M76</link>
      <description>&lt;P&gt;&lt;SPAN&gt;Learn how Databricks SQL can analyze historical read and write patterns to intelligently build indexes and optimize DELETE, MERGE, and UPDATE operations.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 14 Sep 2023 08:05:19 GMT</pubDate>
      <guid>https://community.databricks.com/t5/community-articles/set-up-ai-driven-optimizations-in-databricks-sql/m-p/43916#M76</guid>
      <dc:creator>Sujitha</dc:creator>
      <dc:date>2023-09-14T08:05:19Z</dc:date>
    </item>
  </channel>
</rss>

