Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

Streaming inference with Delta Live Tables for a model registered in Unity Catalog

BeadsPlayer
New Contributor II

Hi there,

I'm trying to run streaming inference with Delta Live Tables against tables and a model registered in Unity Catalog, but it fails for unclear reasons.

The DLT pipeline is based on a notebook; the channel is set to 'Preview', presumably running on Runtime 13.3 LTS.

The code:

********************************************************************************************

%pip install mlflow[databricks]==2.8.0
%pip install importlib_metadata==4.11.3
%pip install zipp==3.8.0
%pip install MarkupSafe==2.0.1
%pip install Jinja2==2.11.3

import mlflow
import dlt

from pyspark.sql.functions import struct
from delta.tables import DeltaTable

# Input (source) table name and schema
catalog = "aiml"
database = "titanic"
input_table_name = "delta_live_infer_input"
input_table_name_full = f"{catalog}.{database}.{input_table_name}"

mlflow.set_registry_uri('databricks-uc')

model_name = 'aiml.titanic.dev-titanic-model'
model_uri = f"models:/{model_name}/2"

target_column = 'Survived_prediction'
id_column = 'PassengerId'
output_cols = [id_column, target_column]

input_delta_table = DeltaTable.forName(spark, input_table_name_full)

# The input table columns as a list of strings.
# This is used to pass the schema to the model predict UDF.
input_dlt_table_columns = input_delta_table.toDF().columns

# Create a Spark user-defined function for model prediction.
# Note: virtualenv is used here to restore the Python environment
# that was used to train the model.
predict = mlflow.pyfunc.spark_udf(spark, model_uri, result_type="double", env_manager='virtualenv')

@dlt.table(
    comment=f"DLT for predictions scored by {model_name} based on {input_table_name} Delta table.",
    table_properties={"quality": "gold"}
)
def delta_live_predictions():
    return (
        spark.readStream.table(input_table_name_full)
        .withColumn(target_column, predict(struct(*input_dlt_table_columns)))
        .select(output_cols)
    )

 

********************************************************************************************

 

 

The model is a Spark logistic regression.

I had to add the installation of specific package versions, otherwise the pipeline would fail complaining that those packages were missing; figuring out which versions to pin took some trial and error.
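Rather than discovering missing packages one failure at a time, one option (a sketch only; it assumes you can fetch the model's logged requirements file, e.g. via `mlflow.pyfunc.get_model_dependencies(model_uri)`, and paste its contents below) is to turn that requirements list into the matching `%pip` pins:

```python
def to_pip_magics(requirements_text: str) -> list[str]:
    """Turn a requirements.txt body into Databricks %pip install lines."""
    lines = []
    for raw in requirements_text.splitlines():
        req = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if req:
            lines.append(f"%pip install {req}")
    return lines

# Example requirements body (stand-in for the model's real dependency list).
reqs = """\
mlflow[databricks]==2.8.0
importlib_metadata==4.11.3  # pinned to match the training env
zipp==3.8.0
"""
for line in to_pip_magics(reqs):
    print(line)
```

This keeps the notebook's pins in sync with whatever environment the model was actually logged from, instead of guessing version by version.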

 

This works fine for models and tables not in Unity Catalog, but with Unity Catalog it returns the error below.

The model was trained and logged with mlflow==2.8.0 on Runtime 14.2 ML. I tried mlflow[databricks] versions 2.4.1, 2.5.0, 2.6.0, 2.7.1, and 2.8.0 - all the same. It looks like the missing dependency 'GLIBC_2.3x' prevents MLflow from starting the virtualenv.

What am I doing wrong?
 
***********************Traceback**********************************************
 
org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = e3153759-7718-4993-b5fe-caaf8881c8cd, runId = 9adca3c0-925b-46ba-a56c-afa6cf5f0bdd] terminated with exception: Exception thrown in awaitResult: Job aborted due to stage failure: Task 7 in stage 87.0 failed 4 times, most recent failure: Lost task 7.3 in stage 87.0 (TID 197) (10.1.4.10 executor 0): org.apache.spark.SparkRuntimeException: [UDF_USER_CODE_ERROR.GENERIC] Execution of function udf(named_struct(PassengerId, PassengerId#5911, Sex, Sex#5912, Age, Age#5913, Fare, Fare#5914, Pclass, Pclass#5915, Family_cnt, Family_cnt#5916, Cabin_ind, Cabin_ind#5917)) failed. 
== Error ==
mlflow.exceptions.MlflowException: During spark UDF task execution, mlflow model server failed to launch. MLflow model server output:
/local_disk0/.ephemeral_nfs/repl_tmp_data/ReplId-3bc28-e5133-418de-6/mlflow/envs/virtualenv_envs/mlflow-0be5b9a8b81d469722f3d82be553c02bfe5b71ab/bin/python: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required by /local_disk0/.ephemeral_nfs/repl_tmp_data/ReplId-3bc28-e5133-418de-6/mlflow/envs/virtualenv_envs/mlflow-0be5b9a8b81d469722f3d82be553c02bfe5b71ab/bin/python)
/local_disk0/.ephemeral_nfs/repl_tmp_data/ReplId-3bc28-e5133-418de-6/mlflow/envs/virtualenv_envs/mlflow-0be5b9a8b81d469722f3d82be553c02bfe5b71ab/bin/python: /lib/x86_64-linux-gnu/libm.so.6: version `GLIBC_2.35' not found (required by /local_disk0/.ephemeral_nfs/repl_tmp_data/ReplId-3bc28-e5133-418de-6/mlflow/envs/pyenv_root/versions/3.10.12/lib/libpython3.10.so.1.0)
/local_disk0/.ephemeral_nfs/repl_tmp_data/ReplId-3bc28-e5133-418de-6/mlflow/envs/virtualenv_envs/mlflow-0be5b9a8b81d469722f3d82be553c02bfe5b71ab/bin/python: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.33' not found (required by /local_disk0/.ephemeral_nfs/repl_tmp_data/ReplId-3bc28-e5133-418de-6/mlflow/envs/pyenv_root/versions/3.10.12/lib/libpython3.10.so.1.0)
/local_disk0/.ephemeral_nfs/repl_tmp_data/ReplId-3bc28-e5133-418de-6/mlflow/envs/virtualenv_envs/mlflow-0be5b9a8b81d469722f3d82be553c02bfe5b71ab/bin/python: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by /local_disk0/.ephemeral_nfs/repl_tmp_data/ReplId-3bc28-e5133-418de-6/mlflow/envs/pyenv_root/versions/3.10.12/lib/libpython3.10.so.1.0)
/local_disk0/.ephemeral_nfs/repl_tmp_data/ReplId-3bc28-e5133-418de-6/mlflow/envs/virtualenv_envs/mlflow-0be5b9a8b81d469722f3d82be553c02bfe5b71ab/bin/python: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required by /local_disk0/.ephemeral_nfs/repl_tmp_data/ReplId-3bc28-e5133-418de-6/mlflow/envs/pyenv_root/versions/3.10.12/lib/libpython3.10.so.1.0)
== Stacktrace ==
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-1a41553e-c975-4f29-ac42-ba4262b5bb4e/lib/python3.10/site-packages/mlflow/pyfunc/__init__.py", line 1266, in udf
    raise MlflowException(err_msg) from e
at org.apache.spark.sql.errors.QueryExecutionErrors$.failedExecuteUserDefinedFunctionSafeSpark(QueryExecutionErrors.scala:258)
at com.databricks.sql.execution.safespark.EvalExternalUDFExec.awaitBatchResult(EvalExternalUDFExec.scala:258)
at com.databricks.sql.execution.safespark.EvalExternalUDFExec.$anonfun$doExecute$12(EvalExternalUDFExec.scala:204)
at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:486)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:492)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$WholeStageCodegenPartitionEvaluator$$anon$1.hasNext(WholeStageCodegenEvaluatorFactory.scala:43)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
at org.apache.spark.shuffle.sort.UnsafeShuffleWriter.write(UnsafeShuffleWriter.java:195)
at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:57)
at org.apache.spark.scheduler.ShuffleMapTask.$anonfun$runTask$3(ShuffleMapTask.scala:92)
at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)
at org.apache.spark.scheduler.ShuffleMapTask.$anonfun$runTask$1(ShuffleMapTask.scala:87)
at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:58)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:39)
at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:196)
at org.apache.spark.scheduler.Task.doRunTask(Task.scala:181)
at org.apache.spark.scheduler.Task.$anonfun$run$5(Task.scala:146)
at com.databricks.unity.UCSEphemeralState$Handle.runWith(UCSEphemeralState.scala:41)
at com.databricks.unity.HandleImpl.runWith(UCSHandle.scala:99)
at com.databricks.unity.HandleImpl.$anonfun$runWithAndClose$1(UCSHandle.scala:104)
at scala.util.Using$.resource(Using.scala:269)
at com.databricks.unity.HandleImpl.runWithAndClose(UCSHandle.scala:103)
at org.apache.spark.scheduler.Task.$anonfun$run$1(Task.scala:146)
at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)
at org.apache.spark.scheduler.Task.run(Task.scala:99)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$8(Executor.scala:930)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:102)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:933)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:825)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
 
Driver stacktrace:
[... the same nested error and stack trace repeat for each wrapped driver exception; truncated ...]

********************************************************************************************
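The key lines in the output are the ``GLIBC_2.3x' not found` messages: the Python restored by virtualenv was built against a newer glibc (2.32 through 2.35) than the one available on the workers. A small standard-library sketch (hypothetical helper, shown only to make the diagnosis concrete) that pulls the required versions out of such log text:

```python
import re

def required_glibc_versions(log_text: str) -> list[str]:
    """Collect the distinct GLIBC versions a loader error reports as missing."""
    versions = re.findall(r"version `GLIBC_(\d+\.\d+)' not found", log_text)
    # De-duplicate while keeping numeric order (so '2.9' would sort before '2.32').
    return sorted(set(versions), key=lambda v: tuple(map(int, v.split("."))))

# Abbreviated stand-in for the MLflow model server output above.
log = (
    "/bin/python: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found\n"
    "/bin/python: /lib/x86_64-linux-gnu/libm.so.6: version `GLIBC_2.35' not found\n"
    "/bin/python: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.33' not found\n"
    "/bin/python: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found\n"
)
print(required_glibc_versions(log))  # ['2.32', '2.33', '2.34', '2.35']
```

If the executors' glibc is older than every version listed, no amount of mlflow version pinning will help; the restored interpreter simply cannot start on that OS image.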

 