When using the web UI to create a vector index from an existing table with chunked data, the creation fails at the "Initializing" phase after about 20 minutes. I have validated that the requirements detailed here are satisfied.
I have also seen that there are a few posts on the forum about similar issues, but the resolution was not clear to me: they suggested either using a small GPU instance (which doesn't seem to apply when the pipeline is created and run by serverless compute not necessarily managed by me) or simply waiting 48 hours for the issue to resolve itself. If there is a way for me to use a small GPU instance for this task I am happy to do that, but I don't see any obvious way to do so. It appears that the process is "managed" up to the point where the pipeline is sent for execution, and it then either passes or fails.
I have attempted the creation a few times (even after waiting 48 hours). I have also followed a few older posts that suggest making sure there is an S3 bucket associated with the metastore, validated permissions, and kicked off the pipeline from a notebook (rough sketch below), and nothing seems to work.
I am running on AWS (us-west-2).
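For reference, the notebook attempt was roughly along these lines, using the databricks-vectorsearch Python client. The endpoint, table, column, and model names here are placeholders for my actual values:

# Minimal sketch of the notebook-based attempt; all names below are placeholders.
from databricks.vector_search.client import VectorSearchClient

client = VectorSearchClient()

index = client.create_delta_sync_index(
    endpoint_name="my_vs_endpoint",                 # existing vector search endpoint
    index_name="main.default.docs_chunked_index",   # Unity Catalog path for the new index
    source_table_name="main.default.docs_chunked",  # existing table with chunked data
    pipeline_type="TRIGGERED",
    primary_key="chunk_id",
    embedding_source_column="chunk_text",           # column containing the text to embed
    embedding_model_endpoint_name="databricks-bge-large-en",
)

This fails in the same way as the web UI flow, with the pipeline stuck at "Initializing" and then erroring out.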
Specific log error (short version): Failed to resolve flow: '__online_index_view'.
at com.databricks.backend.common.util.TimeUtils$.retryWithExponentialBackoff0(TimeUtils.scala:216)
at com.databricks.backend.common.util.TimeUtils$.retryWithExponentialBackoff(TimeUtils.scala:145)
at com.databricks.pipelines.execution.extensions.brickindex.DatabricksHttpClient.sendRequestWithRetries(DatabricksHttpClient.scala:124)
at com.databricks.pipelines.execution.extensions.brickindex.DatabricksHttpClient.post(DatabricksHttpClient.scala:213)
at com.databricks.pipelines.execution.extensions.brickindex.BrickIndexGatewayClient.$anonfun$makePredictions$2(GatewayClient.scala:338)
at com.databricks.pipelines.execution.extensions.brickindex.BrickIndexGatewayClient.withCredentials(GatewayClient.scala:160)
at com.databricks.pipelines.execution.extensions.brickindex.BrickIndexGatewayClient.makePredictions(GatewayClient.scala:335)
at com.databricks.pipelines.execution.extensions.brickindex.ModelServingBatchProcessor.processViaGateway(ModelServingBatchProcessor.scala:101)
at com.databricks.pipelines.execution.extensions.brickindex.ModelServingBatchProcessor.process(ModelServingBatchProcessor.scala:78)
at com.databricks.pipelines.execution.extensions.brickindex.VectorSearchIngestionProcessor.$anonfun$processIngestionWithConcurrency$6(VectorSearchIngestionProcessor.scala:133)
at com.databricks.pipelines.execution.extensions.brickindex.VectorSearchIngestionProcessor.$anonfun$processIngestionWithConcurrency$6$adapted(VectorSearchIngestionProcessor.scala:132)
at com.databricks.pipelines.execution.extensions.brickindex.VectorSearchIngestionProcessor.$anonfun$processIngestionBatchFuture$1(VectorSearchIngestionProcessor.scala:229)
at scala.concurrent.Future$.$anonfun$apply$1(Future.scala:659)
at scala.util.Success.$anonfun$map$1(Try.scala:255)
at scala.util.Success.map(Try.scala:213)
at scala.concurrent.Future.$anonfun$map$1(Future.scala:292)
at scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:46)
at scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:46)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:77)
at com.databricks.threading.DatabricksExecutionContext$InstrumentedRunnable.run(DatabricksExecutionContext.scala:36)
at com.databricks.threading.NamedExecutor$$anon$2.$anonfun$run$1(NamedExecutor.scala:367)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at com.databricks.logging.UsageLogging.$anonfun$withAttributionContext$1(UsageLogging.scala:426)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
at com.databricks.logging.AttributionContext$.withValue(AttributionContext.scala:216)
at com.databricks.logging.UsageLogging.withAttributionContext(UsageLogging.scala:424)
at com.databricks.logging.UsageLogging.withAttributionContext$(UsageLogging.scala:418)
at com.databricks.threading.NamedExecutor.withAttributionContext(NamedExecutor.scala:294)
at com.databricks.threading.NamedExecutor$$anon$2.run(NamedExecutor.scala:365)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)