
Vector Search index not indexing the whole Delta table

Nicolaus
New Contributor II

I have a Delta table that I'm trying to index but when I try to create a vector search index with either the UI or the Python SDK, it only indexes 1 row out of my 3000 rows. I have tried using different vector search endpoints.

I have verified the following:

  • I have a Unity Catalog enabled workspace.
  • Serverless compute is enabled. 
  • My source table has Change Data Feed enabled.
  • I have CREATE TABLE privileges.

Has anyone encountered the same issue?

2 REPLIES

Kaniz_Fatma
Community Manager

Hi @Nicolaus

  1. Ensure that the Delta table is properly structured and that the columns used for indexing are correctly defined. Verify that the columns are of the correct data types and are not nullable.
  2. If the table is being populated through trickle insertion, it might be causing the issue. Trickle insertion can lead to multiple open delta rowgroups being created, which can affect indexing. Consider using a single insert statement for larger batches of data.
  3. If the table has a columnstore index, it might be causing the issue. Columnstore indexes can lead to multiple open delta rowgroups if the insert statements are not properly batched. Consider using a single insert statement for larger batches of data.
  4. Large Delta tables can cause indexing issues. If the table is very large, it might be necessary to partition the table or use a more efficient indexing strategy.
  5. Ensure that Unity Catalog and serverless compute are properly configured and that there are no issues with the underlying infrastructure. Verify that the compute resources are sufficient to handle the indexing process.
  6. Ensure that you have the necessary CREATE TABLE privileges to create the vector search index. Verify that your user account has the required permissions.
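
The column check in item 1 can be done directly. One cause worth ruling out (an assumption, not something confirmed in this thread) is that the column chosen as the index's primary key is not unique or contains nulls: if every row shares the same key value, it would be plausible for the index to end up with a single row. Below is a minimal sketch in plain Python; the helper name `primary_key_report` and the sample rows are hypothetical. In a notebook, the rows could come from something like `[r.asDict() for r in spark.table("<your_table>").select("<key_column>").collect()]`.

```python
from collections import Counter
from typing import Any, Iterable, Mapping


def primary_key_report(rows: Iterable[Mapping[str, Any]], key: str) -> dict:
    """Count total rows, distinct keys, null keys, and duplicated key values."""
    counts = Counter()
    null_rows = 0
    total = 0
    for row in rows:
        total += 1
        value = row.get(key)
        if value is None:
            null_rows += 1
        else:
            counts[value] += 1
    # Keep only the key values that appear on more than one row.
    duplicated = {k: n for k, n in counts.items() if n > 1}
    return {
        "total_rows": total,
        "distinct_keys": len(counts),
        "null_keys": null_rows,
        "duplicated_keys": duplicated,
    }


# Hypothetical sample: three rows that all share the same id.
sample = [
    {"id": 1, "text": "a"},
    {"id": 1, "text": "b"},
    {"id": 1, "text": "c"},
]
report = primary_key_report(sample, key="id")
print(report)
```

If `distinct_keys` is much smaller than `total_rows`, the key column is probably the wrong choice for the index.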

By checking these potential causes and implementing the necessary changes, you should be able to resolve the issue and successfully create a Vector Search index for your Delta table.

If you have any further queries, feel free to ask! 😊

Thanks for the suggestions, @Kaniz_Fatma!

I tried optimizing my table and batching my data into batches of 500 rows, but it's still not indexing my Delta table. I'm sure that Unity Catalog and serverless compute are properly configured as well.
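
To narrow this down, it may help to compare what the index itself reports against the source table's row count. The sketch below assumes the index description is a dict with a `status` entry holding `indexed_row_count`, `ready`, and `message`; that shape is an assumption here, and the helper `sync_gap` and the sample response are hypothetical. In a workspace, the dict would presumably come from `VectorSearchClient().get_index(endpoint_name=..., index_name=...).describe()`.

```python
from typing import Any, Mapping


def sync_gap(describe_response: Mapping[str, Any], source_row_count: int) -> dict:
    """Compare the row count reported by the index with the source table's count."""
    status = describe_response.get("status", {})
    indexed = status.get("indexed_row_count", 0)
    return {
        "ready": bool(status.get("ready", False)),
        "indexed_rows": indexed,
        "missing_rows": max(source_row_count - indexed, 0),
        "message": status.get("message", ""),
    }


# Hypothetical response shaped like the assumption described above.
fake_response = {
    "status": {"ready": True, "indexed_row_count": 1, "message": "Index is online"}
}
gap = sync_gap(fake_response, source_row_count=3000)
print(gap)
```

A large `missing_rows` value together with `ready` being true would suggest the sync completed but skipped or merged rows, rather than still being in progress.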

 

What is the maximum number of rows that can be indexed in Databricks? Is there a way to combine multiple indexes in Databricks? What other indexing strategies are available in Databricks that could be more efficient?
