
Set up AI-driven optimizations in Databricks SQL

Sujitha
Databricks Employee

With Predictive I/O for reads (GA) and updates (Public Preview), Databricks SQL can now analyze historical read and write patterns to intelligently build indexes and optimize DELETE, MERGE, and UPDATE operations.

What is Predictive I/O?

Predictive I/O is a collection of Databricks optimizations that improve performance for data interactions. Predictive I/O capabilities are grouped into the following categories:

  • Accelerated reads reduce the time it takes to scan and read data.
  • Accelerated updates reduce the amount of data that needs to be rewritten during updates, deletes, and merges.

Predictive I/O leverages deletion vectors to accelerate updates by reducing the frequency of full file rewrites during data modification on Delta tables. It optimizes DELETE, MERGE, and UPDATE operations.

Rather than rewriting all records in a data file when any record is updated or deleted, predictive I/O uses deletion vectors to indicate records have been removed from the target data files. Supplemental data files are used to indicate updates.
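As a rough illustration of the kinds of statements this applies to, here is a minimal sketch. The table name sales_demo and its columns are hypothetical, and whether a given statement actually writes a deletion vector instead of rewriting a file depends on your compute type and runtime version.

```sql
-- Hypothetical demo table; the name and columns are illustrative only.
CREATE TABLE IF NOT EXISTS sales_demo (id INT, amount DOUBLE)
TBLPROPERTIES ('delta.enableDeletionVectors' = true);

INSERT INTO sales_demo VALUES (1, 10.0), (2, 20.0), (3, 30.0);

-- On supported compute, these statements may mark affected rows as removed
-- in a deletion vector rather than rewriting the whole data file.
DELETE FROM sales_demo WHERE id = 1;

UPDATE sales_demo SET amount = 25.0 WHERE id = 2;

MERGE INTO sales_demo AS t
USING (SELECT 3 AS id, 35.0 AS amount) AS s
ON t.id = s.id
WHEN MATCHED THEN UPDATE SET t.amount = s.amount
WHEN NOT MATCHED THEN INSERT (id, amount) VALUES (s.id, s.amount);
```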

How to get started:

1. Use a serverless or pro SQL warehouse, or a Photon-accelerated cluster running Databricks Runtime 11.2 or above.

2. Enable support for deletion vectors on a Delta Lake table by setting a table property as follows:

ALTER TABLE <table-name> SET TBLPROPERTIES ('delta.enableDeletionVectors' = true);

Deletion vectors are a storage optimization feature that can be enabled on Delta Lake tables.
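To confirm the property took effect, you can read it back. This is a small sketch in which my_table is a placeholder for your own table name.

```sql
-- my_table is a placeholder; substitute your own table name.
-- Returns the current value of the property if it has been set on the table.
SHOW TBLPROPERTIES my_table ('delta.enableDeletionVectors');
```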


Things to consider:

When you enable deletion vectors, the table protocol version is upgraded. Table protocol version upgrades are not reversible. After upgrading, the table will not be readable by Delta Lake clients that do not support deletion vectors. See How does Databricks manage Delta Lake feature compatibility?
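Because the upgrade cannot be undone, it may be worth recording the table's protocol versions before enabling deletion vectors and comparing them afterwards. A minimal sketch, again using the placeholder name my_table:

```sql
-- my_table is a placeholder; substitute your own table name.
-- The result includes the minReaderVersion and minWriterVersion columns;
-- recent runtimes may also list the enabled table features.
DESCRIBE DETAIL my_table;
```

Run it once before and once after the ALTER TABLE statement to see what changed.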

Predictive I/O updates share all limitations with deletion vectors. In Databricks Runtime 12.1 and greater, the following limitations exist:

 
