07-29-2024 03:48 AM
When setting up a vector search index in Databricks using the bge_m3 (version 1) embedding model available in the system.ai schema, the setup runs for about 20 minutes and then fails. Querying the served embedding model from the browser works perfectly fine.
The exact same data worked in the past (although in a different workspace). I've retried several times over a longer period of time, so this does not seem to be a temporary issue.
The flow_progress step in the pipeline created for the index fails with
Failed to resolve flow: '__online_index_view'
and error details:
java.lang.Exception: Error: Response Code: 400, Response: {"error_code":"INVALID_PARAMETER_VALUE","message":"Failed to call Model Serving endpoint: bge_m3_embedding."}
at com.databricks.pipelines.execution.extensions.brickindex.DatabricksHttpClient.$anonfun$sendRequestWithRetries$5(DatabricksHttpClient.scala:129)
at com.databricks.pipelines.execution.extensions.brickindex.DatabricksHttpClient.$anonfun$sendRequestWithRetries$5$adapted(DatabricksHttpClient.scala:121)
at scala.util.Using$.resource(Using.scala:269)
at com.databricks.pipelines.execution.extensions.brickindex.DatabricksHttpClient.$anonfun$sendRequestWithRetries$4(DatabricksHttpClient.scala:121)
at com.databricks.backend.common.util.TimeUtils$.$anonfun$retryWithExponentialBackoff0$1(TimeUtils.scala:191)
at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:23)
at scala.util.Try$.apply(Try.scala:213)
at com.databricks.backend.common.util.TimeUtils$.retryWithExponentialBackoff0(TimeUtils.scala:191)
at com.databricks.backend.common.util.TimeUtils$.retryWithExponentialBackoff(TimeUtils.scala:145)
at com.databricks.pipelines.execution.extensions.brickindex.DatabricksHttpClient.sendRequestWithRetries(DatabricksHttpClient.scala:120)
at com.databricks.pipelines.execution.extensions.brickindex.DatabricksHttpClient.post(DatabricksHttpClient.scala:209)
at com.databricks.pipelines.execution.extensions.brickindex.BrickIndexGatewayClient.$anonfun$makePredictions$2(GatewayClient.scala:335)
at com.databricks.pipelines.execution.extensions.brickindex.BrickIndexGatewayClient.withCredentials(GatewayClient.scala:157)
at com.databricks.pipelines.execution.extensions.brickindex.BrickIndexGatewayClient.makePredictions(GatewayClient.scala:332)
at com.databricks.pipelines.execution.extensions.brickindex.ModelServingBatchProcessor.processViaGateway(ModelServingBatchProcessor.scala:96)
at com.databricks.pipelines.execution.extensions.brickindex.ModelServingBatchProcessor.process(ModelServingBatchProcessor.scala:75)
at com.databricks.pipelines.execution.extensions.brickindex.VectorSearchIngestionProcessor.$anonfun$processIngestionWithConcurrency$6(VectorSearchIngestionProcessor.scala:125)
at com.databricks.pipelines.execution.extensions.brickindex.VectorSearchIngestionProcessor.$anonfun$processIngestionWithConcurrency$6$adapted(VectorSearchIngestionProcessor.scala:125)
at com.databricks.pipelines.execution.extensions.brickindex.VectorSearchIngestionProcessor.$anonfun$processIngestionBatchFuture$1(VectorSearchIngestionProcessor.scala:216)
at scala.concurrent.Future$.$anonfun$apply$1(Future.scala:659)
at scala.util.Success.$anonfun$map$1(Try.scala:255)
at scala.util.Success.map(Try.scala:213)
at scala.concurrent.Future.$anonfun$map$1(Future.scala:292)
at scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:46)
at scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:46)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:77)
at com.databricks.threading.DatabricksExecutionContext$InstrumentedRunnable.run(DatabricksExecutionContext.scala:36)
at com.databricks.threading.NamedExecutor$$anon$2.$anonfun$run$1(NamedExecutor.scala:367)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at com.databricks.logging.UsageLogging.$anonfun$withAttributionContext$1(UsageLogging.scala:426)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
at com.databricks.logging.AttributionContext$.withValue(AttributionContext.scala:216)
at com.databricks.logging.UsageLogging.withAttributionContext(UsageLogging.scala:424)
at com.databricks.logging.UsageLogging.withAttributionContext$(UsageLogging.scala:418)
at com.databricks.threading.NamedExecutor.withAttributionContext(NamedExecutor.scala:294)
at com.databricks.threading.NamedExecutor$$anon$2.run(NamedExecutor.scala:365)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Any ideas what the problem might be?
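One way to narrow this down is to call the serving endpoint directly with the same kind of payload the sync pipeline would send, and see whether you get the same 400 back. Below is a minimal sketch using only the standard library; the workspace URL, token, and the `"inputs"` payload key are assumptions (check the query example shown on the endpoint's serving page for the exact field name):

```python
import json
import urllib.request


def build_embedding_payload(texts):
    """Build a request body for the embedding endpoint.

    The "inputs" key is an assumption based on the endpoint's UI example;
    older examples reportedly used a different field name.
    """
    return {"inputs": texts}


def query_endpoint(host, endpoint_name, token, texts):
    """POST the payload to the Model Serving invocations URL and decode JSON."""
    url = f"{host}/serving-endpoints/{endpoint_name}/invocations"
    req = urllib.request.Request(
        url,
        data=json.dumps(build_embedding_payload(texts)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


# Example usage (placeholders for your workspace):
#   out = query_endpoint("https://<workspace-url>", "bge_m3_embedding",
#                        token, ["hello world"])
```

If this call succeeds with the same payload shape the pipeline uses, the problem is more likely on the serving side (capacity, timeouts) than in the request format.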
07-31-2024 01:50 AM
The issue was most likely caused by using CPU compute for the deployed model; switching to GPU (small) solved it.
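For anyone hitting the same thing: the endpoint's compute can be changed without recreating it by updating the serving endpoint config. A rough stdlib-only sketch against the serving REST API is below; the endpoint and entity names are placeholders, and you should verify the exact workload type strings (e.g. "GPU_SMALL") available for your cloud and region:

```python
import json
import urllib.request


def gpu_endpoint_config(entity_name, entity_version):
    """Served-entity config that moves the model onto small GPU compute.

    "GPU_SMALL" / "Small" are the workload values this sketch assumes;
    confirm the supported options in your workspace before applying.
    """
    return {
        "served_entities": [
            {
                "entity_name": entity_name,
                "entity_version": entity_version,
                "workload_type": "GPU_SMALL",
                "workload_size": "Small",
                "scale_to_zero_enabled": False,
            }
        ]
    }


def update_endpoint(host, token, endpoint_name, config):
    """PUT the new config to the serving endpoint, triggering a redeploy."""
    url = f"{host}/api/2.0/serving-endpoints/{endpoint_name}/config"
    req = urllib.request.Request(
        url,
        data=json.dumps(config).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="PUT",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


# Example usage (placeholders for your workspace):
#   cfg = gpu_endpoint_config("system.ai.bge_m3", "1")
#   update_endpoint("https://<workspace-url>", token, "bge_m3_embedding", cfg)
```

The redeploy takes a few minutes; the vector search setup should be retried only after the endpoint reports a ready state.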
07-29-2024 11:15 AM
I would double-check what specific values are being sent to the model in the workflow. Possibly moving between environments changed a value's type, or the data isn't defined correctly, leaving certain parameters empty?
The "INVALID_PARAMETER_VALUE" in the embedding output makes me believe something isn't being set correctly in the workflow when the endpoint is accessed programmatically.
07-30-2024 12:50 AM
Hi taylor-xorbix,
I'm not defining a workflow manually or setting any environment variables. I'm using the Databricks UI: from Unity Catalog I'm using the Create > Vector search index dropdown, with a running (and working) bge_m3 endpoint.
Looking at the UI's example request for the served embedding model, it seems that the API now specifies "inputs". If I remember correctly, it was called "message" at some point in the past, which would explain the error message above.
The thing is that I'm not installing anything manually, just using the built-in Databricks UI functionality, so this should all work together.
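As a cross-check against the UI flow, the same Delta Sync index can be created programmatically through the vector-search REST API, which makes the exact parameters visible. A stdlib-only sketch follows; every catalog, table, and endpoint name is a placeholder, and the request-body field names should be verified against the current API reference:

```python
import json
import urllib.request


def delta_sync_index_body(index_name, endpoint_name, source_table,
                          primary_key, text_column, model_endpoint):
    """Request body for creating a Delta Sync vector search index.

    Field names follow the public vector-search REST API as this sketch
    assumes it; the concrete names below are illustrative only.
    """
    return {
        "name": index_name,
        "endpoint_name": endpoint_name,
        "primary_key": primary_key,
        "index_type": "DELTA_SYNC",
        "delta_sync_index_spec": {
            "source_table": source_table,
            "pipeline_type": "TRIGGERED",
            "embedding_source_columns": [
                {
                    "name": text_column,
                    "embedding_model_endpoint_name": model_endpoint,
                }
            ],
        },
    }


def create_index(host, token, body):
    """POST the index definition to the vector-search indexes API."""
    req = urllib.request.Request(
        f"{host}/api/2.0/vector-search/indexes",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


# Example usage (placeholders for your workspace):
#   body = delta_sync_index_body("main.default.docs_index", "vs-endpoint",
#                                "main.default.docs", "id", "text",
#                                "bge_m3_embedding")
#   create_index("https://<workspace-url>", token, body)
```

If the programmatic create fails with the same 400, that points at the serving endpoint itself rather than anything the UI is doing.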