<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: DLT pipeline MLFlow UDF error in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/dlt-pipeline-mlflow-udf-error/m-p/51981#M29376</link>
    <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/42997"&gt;@BarryC&lt;/a&gt;&amp;nbsp;this has worked.&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;%pip install importlib-metadata==4.11.3
%pip install zipp==3.8.0&lt;/LI-CODE&gt;&lt;P&gt;Adding these two lines at the start of your DLT UDF register notebook will solve the issue.&lt;BR /&gt;&lt;BR /&gt;Databricks advocates using DLT for ML inference in all its docs and tutorials, but this is a standard incompatibility inherent to the setup. I hope Databricks will take action and resolve this asap.&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Kind regards,&lt;/P&gt;&lt;P&gt;Data Interlaced ltd&lt;BR /&gt;&lt;BR /&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Wed, 15 Nov 2023 07:33:11 GMT</pubDate>
    <dc:creator>Data_Interlaced</dc:creator>
    <dc:date>2023-11-15T07:33:11Z</dc:date>
    <item>
      <title>DLT pipeline MLFlow UDF error</title>
      <link>https://community.databricks.com/t5/data-engineering/dlt-pipeline-mlflow-udf-error/m-p/50533#M28822</link>
      <description>&lt;P&gt;I am running this notebook via the dlt pipeline in preview mode.&lt;/P&gt;&lt;P&gt;everything works up until the predictions table that should be created with a registered model inferencing the gold table.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Feather_0-1699311273694.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/4739iD1104E242B148439/image-size/medium/is-moderation-mode/true?v=v2&amp;amp;px=400" role="button" title="Feather_0-1699311273694.png" alt="Feather_0-1699311273694.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;DIV&gt;&lt;DIV&gt;&amp;nbsp;This is the&amp;nbsp; error:&amp;nbsp;&lt;STRONG&gt;com databricks spark safespark UDFException: INVALID_ARGUMENT: No module named 'importlib_metadata'&lt;/STRONG&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;# Databricks notebook source&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;# MAGIC %pip install mlflow&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;# MAGIC %pip install importlib_metadata&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;# COMMAND ----------&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;import&lt;/SPAN&gt; &lt;SPAN&gt;mlflow&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;import&lt;/SPAN&gt; &lt;SPAN&gt;importlib_metadata&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;model_uri&lt;/SPAN&gt; &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;f&lt;/SPAN&gt;&lt;SPAN&gt;"models:/soybeans_volatility/1"&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;# create spark user-defined function for model 
prediction.&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;predict&lt;/SPAN&gt; &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;mlflow&lt;/SPAN&gt;&lt;SPAN&gt;.&lt;/SPAN&gt;&lt;SPAN&gt;pyfunc&lt;/SPAN&gt;&lt;SPAN&gt;.&lt;/SPAN&gt;&lt;SPAN&gt;spark_udf&lt;/SPAN&gt;&lt;SPAN&gt;(spark, &lt;/SPAN&gt;&lt;SPAN&gt;model_uri&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;result_type&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt;"double"&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;env_manager&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt;'virtualenv'&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;# COMMAND ----------&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;import&lt;/SPAN&gt; &lt;SPAN&gt;dlt&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;from&lt;/SPAN&gt; &lt;SPAN&gt;pyspark&lt;/SPAN&gt;&lt;SPAN&gt;.&lt;/SPAN&gt;&lt;SPAN&gt;sql&lt;/SPAN&gt;&lt;SPAN&gt;.&lt;/SPAN&gt;&lt;SPAN&gt;functions&lt;/SPAN&gt; &lt;SPAN&gt;import&lt;/SPAN&gt; &lt;SPAN&gt;avg&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;max&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;min&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;col&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;lag&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;count&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;when&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;struct&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;from_unixtime&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;unix_timestamp&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;from&lt;/SPAN&gt; &lt;SPAN&gt;pyspark&lt;/SPAN&gt;&lt;SPAN&gt;.&lt;/SPAN&gt;&lt;SPAN&gt;sql&lt;/SPAN&gt;&lt;SPAN&gt;.&lt;/SPAN&gt;&lt;SPAN&gt;window&lt;/SPAN&gt; &lt;SPAN&gt;import&lt;/SPAN&gt; 
&lt;SPAN&gt;Window&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;path_to_uc_external_location&lt;/SPAN&gt; &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;"s3://gfy-databricks-storage/data/barchart/soybeans/"&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;@&lt;/SPAN&gt;&lt;SPAN&gt;dlt&lt;/SPAN&gt;&lt;SPAN&gt;.table&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;name&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt;"soybeans_bronze"&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;table_properties&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt;{&lt;/SPAN&gt;&lt;SPAN&gt;"quality"&lt;/SPAN&gt;&lt;SPAN&gt;: &lt;/SPAN&gt;&lt;SPAN&gt;"bronze"&lt;/SPAN&gt;&lt;SPAN&gt;})&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;def&lt;/SPAN&gt; &lt;SPAN&gt;table_name&lt;/SPAN&gt;&lt;SPAN&gt;():&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;return&lt;/SPAN&gt;&lt;SPAN&gt; (&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;spark.readStream&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;.format(&lt;/SPAN&gt;&lt;SPAN&gt;"cloudFiles"&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;.option(&lt;/SPAN&gt;&lt;SPAN&gt;"cloudFiles.format"&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;"json"&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;.load(&lt;/SPAN&gt;&lt;SPAN&gt;f&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN&gt;{&lt;/SPAN&gt;&lt;SPAN&gt;path_to_uc_external_location&lt;/SPAN&gt;&lt;SPAN&gt;}&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;FONT 
size="2"&gt;&lt;SPAN&gt;@&lt;/SPAN&gt;&lt;SPAN&gt;dlt&lt;/SPAN&gt;&lt;SPAN&gt;.table&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;name&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt;"soybeans_silver"&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;table_properties&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt;{&lt;/SPAN&gt;&lt;SPAN&gt;"quality"&lt;/SPAN&gt;&lt;SPAN&gt;: &lt;/SPAN&gt;&lt;SPAN&gt;"silver"&lt;/SPAN&gt;&lt;SPAN&gt;})&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;def&lt;/SPAN&gt; &lt;SPAN&gt;create_silver_table&lt;/SPAN&gt;&lt;SPAN&gt;():&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;df&lt;/SPAN&gt; &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;dlt&lt;/SPAN&gt;&lt;SPAN&gt;.read(&lt;/SPAN&gt;&lt;SPAN&gt;"soybeans_bronze"&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;cleaned_df&lt;/SPAN&gt; &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;df&lt;/SPAN&gt;&lt;SPAN&gt;.drop(&lt;/SPAN&gt;&lt;SPAN&gt;"_rescued_data"&lt;/SPAN&gt;&lt;SPAN&gt;).filter(&lt;/SPAN&gt;&lt;SPAN&gt;col&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"close"&lt;/SPAN&gt;&lt;SPAN&gt;).&lt;/SPAN&gt;&lt;SPAN&gt;isNotNull&lt;/SPAN&gt;&lt;SPAN&gt;())&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;formatted_df&lt;/SPAN&gt; &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;cleaned_df&lt;/SPAN&gt;&lt;SPAN&gt;.withColumn(&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;"tradeTimestamp"&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;from_unixtime&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;unix_timestamp&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;col&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"timestamp"&lt;/SPAN&gt;&lt;SPAN&gt;), 
&lt;/SPAN&gt;&lt;SPAN&gt;"yyyy-MM-dd'T'HH:mm:ssXXX"&lt;/SPAN&gt;&lt;SPAN&gt;), &lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;"yyyy-MM-dd HH:mm:ss.SSS"&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;return&lt;/SPAN&gt; &lt;SPAN&gt;formatted_df&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;@&lt;/SPAN&gt;&lt;SPAN&gt;dlt&lt;/SPAN&gt;&lt;SPAN&gt;.table&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;name&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt;"soybeans_gold"&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;table_properties&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt;{&lt;/SPAN&gt;&lt;SPAN&gt;"quality"&lt;/SPAN&gt;&lt;SPAN&gt;: &lt;/SPAN&gt;&lt;SPAN&gt;"gold"&lt;/SPAN&gt;&lt;SPAN&gt;})&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;def&lt;/SPAN&gt; &lt;SPAN&gt;create_gold_table&lt;/SPAN&gt;&lt;SPAN&gt;():&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;df_silver&lt;/SPAN&gt; &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;dlt&lt;/SPAN&gt;&lt;SPAN&gt;.read(&lt;/SPAN&gt;&lt;SPAN&gt;"soybeans_silver"&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;# Compute a 7-day rolling average of the close price&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;windowSpec&lt;/SPAN&gt; &lt;SPAN&gt;=&lt;/SPAN&gt; 
&lt;SPAN&gt;Window&lt;/SPAN&gt;&lt;SPAN&gt;.&lt;/SPAN&gt;&lt;SPAN&gt;partitionBy&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"symbol"&lt;/SPAN&gt;&lt;SPAN&gt;).&lt;/SPAN&gt;&lt;SPAN&gt;orderBy&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"tradeTimestamp"&lt;/SPAN&gt;&lt;SPAN&gt;).&lt;/SPAN&gt;&lt;SPAN&gt;rowsBetween&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;-&lt;/SPAN&gt;&lt;SPAN&gt;6&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;0&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;avg_price&lt;/SPAN&gt; &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;avg&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;col&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"close"&lt;/SPAN&gt;&lt;SPAN&gt;).&lt;/SPAN&gt;&lt;SPAN&gt;cast&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"double"&lt;/SPAN&gt;&lt;SPAN&gt;)).&lt;/SPAN&gt;&lt;SPAN&gt;over&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;windowSpec&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;# Compute daily volatility&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;daily_volatility&lt;/SPAN&gt; &lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; (&lt;/SPAN&gt;&lt;SPAN&gt;max&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;col&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"high"&lt;/SPAN&gt;&lt;SPAN&gt;).&lt;/SPAN&gt;&lt;SPAN&gt;cast&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"double"&lt;/SPAN&gt;&lt;SPAN&gt;)).&lt;/SPAN&gt;&lt;SPAN&gt;over&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;windowSpec&lt;/SPAN&gt;&lt;SPAN&gt;) &lt;/SPAN&gt;&lt;SPAN&gt;-&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT 
size="2"&gt;&lt;SPAN&gt;min&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;col&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"low"&lt;/SPAN&gt;&lt;SPAN&gt;).&lt;/SPAN&gt;&lt;SPAN&gt;cast&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"double"&lt;/SPAN&gt;&lt;SPAN&gt;)).&lt;/SPAN&gt;&lt;SPAN&gt;over&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;windowSpec&lt;/SPAN&gt;&lt;SPAN&gt;))&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;# Extract previous day's volume&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;lag_window&lt;/SPAN&gt; &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;Window&lt;/SPAN&gt;&lt;SPAN&gt;.&lt;/SPAN&gt;&lt;SPAN&gt;partitionBy&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"symbol"&lt;/SPAN&gt;&lt;SPAN&gt;).&lt;/SPAN&gt;&lt;SPAN&gt;orderBy&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"tradeTimestamp"&lt;/SPAN&gt;&lt;SPAN&gt;).&lt;/SPAN&gt;&lt;SPAN&gt;rowsBetween&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;-&lt;/SPAN&gt;&lt;SPAN&gt;1&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;-&lt;/SPAN&gt;&lt;SPAN&gt;1&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;prev_day_volume&lt;/SPAN&gt; &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;lag&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;col&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"volume"&lt;/SPAN&gt;&lt;SPAN&gt;), &lt;/SPAN&gt;&lt;SPAN&gt;1&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;0&lt;/SPAN&gt;&lt;SPAN&gt;).&lt;/SPAN&gt;&lt;SPAN&gt;over&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;lag_window&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;df_gold&lt;/SPAN&gt; &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;df_silver&lt;/SPAN&gt;&lt;SPAN&gt;.withColumn(&lt;/SPAN&gt;&lt;SPAN&gt;"7_day_avg_close"&lt;/SPAN&gt;&lt;SPAN&gt;, 
&lt;/SPAN&gt;&lt;SPAN&gt;avg_price&lt;/SPAN&gt;&lt;SPAN&gt;) \&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;.withColumn(&lt;/SPAN&gt;&lt;SPAN&gt;"daily_volatility"&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;daily_volatility&lt;/SPAN&gt;&lt;SPAN&gt;) \&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;.withColumn(&lt;/SPAN&gt;&lt;SPAN&gt;"prev_day_volume"&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;prev_day_volume&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;return&lt;/SPAN&gt; &lt;SPAN&gt;df_gold&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;@&lt;/SPAN&gt;&lt;SPAN&gt;dlt&lt;/SPAN&gt;&lt;SPAN&gt;.table&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;comment&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt;"DLT for predictions scored by soybeans_volatility model based on models.soybeans.soybeans_gold Delta table."&lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;name&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt;"soybeans_gold_preds"&lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;table_properties&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt;{&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;"quality"&lt;/SPAN&gt;&lt;SPAN&gt;: &lt;/SPAN&gt;&lt;SPAN&gt;"gold"&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;}&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;def&lt;/SPAN&gt; 
&lt;SPAN&gt;soybeans_volatility_predictions&lt;/SPAN&gt;&lt;SPAN&gt;():&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;input_dlt_table_name&lt;/SPAN&gt; &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;"soybeans_gold"&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;input_delta_live_table&lt;/SPAN&gt; &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;dlt&lt;/SPAN&gt;&lt;SPAN&gt;.read(&lt;/SPAN&gt;&lt;SPAN&gt;input_dlt_table_name&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;input_dlt_table_columns&lt;/SPAN&gt; &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;input_delta_live_table&lt;/SPAN&gt;&lt;SPAN&gt;.columns&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;predictions_df&lt;/SPAN&gt; &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;input_delta_live_table&lt;/SPAN&gt;&lt;SPAN&gt;.withColumn(&lt;/SPAN&gt;&lt;SPAN&gt;'prediction'&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;predict&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;struct&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt;&lt;SPAN&gt;input_dlt_table_columns&lt;/SPAN&gt;&lt;SPAN&gt;)))&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;return&lt;/SPAN&gt; &lt;SPAN&gt;predictions_df&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;I've tried everything, I've removed the virtualenv and ran the pipeline in current mode (non preview) but no luck.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;This is what I'm using as a guide:&amp;nbsp;&lt;A href="https://docs.databricks.com/en/delta-live-tables/transform.html" target="_blank"&gt;https://docs.databricks.com/en/delta-live-tables/transform.html&lt;/A&gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" 
image-alt="Feather_1-1699311414386.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/4740i3DA6861928E8A72D/image-size/medium/is-moderation-mode/true?v=v2&amp;amp;px=400" role="button" title="Feather_1-1699311414386.png" alt="Feather_1-1699311414386.png" /&gt;&lt;/span&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Mon, 06 Nov 2023 22:57:25 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dlt-pipeline-mlflow-udf-error/m-p/50533#M28822</guid>
      <dc:creator>Feather</dc:creator>
      <dc:date>2023-11-06T22:57:25Z</dc:date>
    </item>
    <item>
      <title>Re: DLT pipeline MLFlow UDF error</title>
      <link>https://community.databricks.com/t5/data-engineering/dlt-pipeline-mlflow-udf-error/m-p/51039#M28939</link>
      <description>&lt;P&gt;I have the same problem: a simple Python UDF in a DLT pipeline fails, saying it can't find&amp;nbsp;&lt;STRONG&gt;importlib_metadata&lt;/STRONG&gt;.&lt;/P&gt;</description>
      <pubDate>Mon, 13 Nov 2023 09:35:16 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dlt-pipeline-mlflow-udf-error/m-p/51039#M28939</guid>
      <dc:creator>Data_Interlaced</dc:creator>
      <dc:date>2023-11-13T09:35:16Z</dc:date>
    </item>
    <item>
      <title>Re: DLT pipeline MLFlow UDF error</title>
      <link>https://community.databricks.com/t5/data-engineering/dlt-pipeline-mlflow-udf-error/m-p/51444#M29150</link>
      <description>&lt;P&gt;What does your code look like? I am going to attend office hours.&lt;/P&gt;</description>
      <pubDate>Tue, 14 Nov 2023 03:42:17 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dlt-pipeline-mlflow-udf-error/m-p/51444#M29150</guid>
      <dc:creator>Feather</dc:creator>
      <dc:date>2023-11-14T03:42:17Z</dc:date>
    </item>
    <item>
      <title>Re: DLT pipeline MLFlow UDF error</title>
      <link>https://community.databricks.com/t5/data-engineering/dlt-pipeline-mlflow-udf-error/m-p/51502#M29169</link>
      <description>&lt;P&gt;&lt;SPAN&gt;&lt;SPAN&gt;&lt;STRONG&gt;UDF registry file:&lt;BR /&gt;&lt;BR /&gt;&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;# Databricks notebook source
# MAGIC %pip install mlflow

# COMMAND ----------

import dlt

# COMMAND ----------

import mlflow
from pyspark.sql.functions import struct, col
logged_model = 'runs:/xxxx/model_overstort_all_devices'

# Load model as a Spark UDF. Override result_type if the model does not return double values.
loaded_model = mlflow.pyfunc.spark_udf(spark, model_uri=logged_model, result_type='double')

spark.udf.register("detect_anomaly", loaded_model)

&lt;/LI-CODE&gt;&lt;P&gt;&lt;STRONG&gt;Actual pipeline code:&lt;/STRONG&gt;&lt;/P&gt;&lt;LI-CODE lang="sql"&gt;CREATE OR REFRESH STREAMING LIVE TABLE iot_overstort_predict
COMMENT "Predicting anomalies in measurement data using the Isolation Forest model"
TBLPROPERTIES ("quality" = "gold")
AS SELECT 
    utc_timestamp,
    measurement_value_float,
    topic, 
    YEAR(utc_timestamp) as year, 
    MONTH(utc_timestamp) as month, 
    DAY(utc_timestamp) as day, 
    hour(utc_timestamp) as hour,
    detect_anomaly(measurement_value_float)
FROM STREAM(aqf_tda.test.xxx)&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Actual pipeline:&lt;BR /&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;CREATE&lt;/SPAN&gt; &lt;SPAN&gt;OR&lt;/SPAN&gt; &lt;SPAN&gt;REFRESH&lt;/SPAN&gt; &lt;SPAN&gt;STREAMING&lt;/SPAN&gt; &lt;SPAN&gt;LIVE&lt;/SPAN&gt; &lt;SPAN&gt;TABLE&lt;/SPAN&gt;&lt;SPAN&gt; iot_overstort_predict&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;COMMENT&lt;/SPAN&gt; &lt;SPAN&gt;"Predicting anomalies in measurement data using the Isolation Forest model"&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;TBLPROPERTIES&lt;/SPAN&gt;&lt;SPAN&gt; (&lt;/SPAN&gt;&lt;SPAN&gt;"quality"&lt;/SPAN&gt; &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;"gold"&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;AS&lt;/SPAN&gt; &lt;SPAN&gt;SELECT&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; utc_timestamp,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; measurement_value_float,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; topic, &lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;YEAR&lt;/SPAN&gt;&lt;SPAN&gt;(utc_timestamp) &lt;/SPAN&gt;&lt;SPAN&gt;as&lt;/SPAN&gt; &lt;SPAN&gt;year&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;MONTH&lt;/SPAN&gt;&lt;SPAN&gt;(utc_timestamp) &lt;/SPAN&gt;&lt;SPAN&gt;as&lt;/SPAN&gt; &lt;SPAN&gt;month&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;DAY&lt;/SPAN&gt;&lt;SPAN&gt;(utc_timestamp) &lt;/SPAN&gt;&lt;SPAN&gt;as&lt;/SPAN&gt; &lt;SPAN&gt;day&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;hour&lt;/SPAN&gt;&lt;SPAN&gt;(utc_timestamp) &lt;/SPAN&gt;&lt;SPAN&gt;as&lt;/SPAN&gt; 
&lt;SPAN&gt;hour&lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; detect_anomaly(measurement_value_float)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;FROM&lt;/SPAN&gt; &lt;SPAN&gt;STREAM&lt;/SPAN&gt;&lt;SPAN&gt;(aqf_tda.test.iot_overstort_sample_v1)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;P&gt;&lt;STRONG&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;/STRONG&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 14 Nov 2023 07:51:21 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dlt-pipeline-mlflow-udf-error/m-p/51502#M29169</guid>
      <dc:creator>Data_Interlaced</dc:creator>
      <dc:date>2023-11-14T07:51:21Z</dc:date>
    </item>
    <item>
      <title>Re: DLT pipeline MLFlow UDF error</title>
      <link>https://community.databricks.com/t5/data-engineering/dlt-pipeline-mlflow-udf-error/m-p/51503#M29170</link>
      <description>&lt;P&gt;To be clear, the UDF registry file and the SQL pipeline code are 2 separate files of the DLT&lt;/P&gt;</description>
      <pubDate>Tue, 14 Nov 2023 07:52:36 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dlt-pipeline-mlflow-udf-error/m-p/51503#M29170</guid>
      <dc:creator>Data_Interlaced</dc:creator>
      <dc:date>2023-11-14T07:52:36Z</dc:date>
    </item>
    <item>
      <title>Re: DLT pipeline MLFlow UDF error</title>
      <link>https://community.databricks.com/t5/data-engineering/dlt-pipeline-mlflow-udf-error/m-p/51516#M29172</link>
      <description>&lt;P&gt;Hi Kaniz,&lt;BR /&gt;&lt;BR /&gt;!pip show&amp;nbsp;&lt;STRONG&gt;importlib_metadata&amp;nbsp;&lt;/STRONG&gt;shows the module's info as expected.&amp;nbsp;&lt;BR /&gt;I get the same error even when running batch inference or structured streaming.&lt;BR /&gt;&lt;BR /&gt;The error appears as soon as I use MLflow on a non-ML runtime cluster. Maybe there is an incompatibility?&lt;/P&gt;</description>
      <pubDate>Tue, 14 Nov 2023 09:17:54 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dlt-pipeline-mlflow-udf-error/m-p/51516#M29172</guid>
      <dc:creator>Data_Interlaced</dc:creator>
      <dc:date>2023-11-14T09:17:54Z</dc:date>
    </item>
    <item>
      <title>Re: DLT pipeline MLFlow UDF error</title>
      <link>https://community.databricks.com/t5/data-engineering/dlt-pipeline-mlflow-udf-error/m-p/51520#M29174</link>
      <description>&lt;P&gt;I can't use an ML runtime because I am using a Unity Catalog enabled cluster. This is a shared cluster capable of accessing the data I need. An ML runtime on UC-enabled clusters is not supported.&lt;BR /&gt;&lt;BR /&gt;While training the model, though, I get the following warning. Maybe this is related?&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;warnings.warn(
2023/11/14 08:59:54 WARNING mlflow.models.model: Logging model metadata to the tracking server has failed, possibly due older server version. The model artifacts have been logged successfully under dbfs:/databricks/mlflow-tracking/f5f6e63a202844138b5fad3fddd0007a/e570c1e2b3ab4cb9b220046a5e5c3a64/artifacts. In addition to exporting model artifacts, MLflow clients 1.7.0 and above attempt to record model metadata to the tracking store. If logging to a mlflow server via REST, consider upgrading the server version to MLflow 1.7.0 or above. Set logging level to DEBUG via `logging.getLogger("mlflow").setLevel(logging.DEBUG)` to see the full traceback.
/databricks/python/lib/python3.9/site-packages/sklearn/base.py:450: UserWarning: X does not have valid feature names, but IsolationForest was fitted with feature names&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 14 Nov 2023 09:34:57 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dlt-pipeline-mlflow-udf-error/m-p/51520#M29174</guid>
      <dc:creator>Data_Interlaced</dc:creator>
      <dc:date>2023-11-14T09:34:57Z</dc:date>
    </item>
    <item>
      <title>Re: DLT pipeline MLFlow UDF error</title>
      <link>https://community.databricks.com/t5/data-engineering/dlt-pipeline-mlflow-udf-error/m-p/51966#M29366</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/93547"&gt;@Data_Interlaced&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I just encountered this issue as well. I compared the libraries installed in my non-ML and ML clusters with pip freeze and found version discrepancies between them.&amp;nbsp;&lt;/P&gt;&lt;P&gt;In my case, I resolved it by installing the same library versions as in the ML cluster. In particular, I needed the following:&lt;/P&gt;&lt;LI-CODE lang="python"&gt;importlib-metadata==4.11.3
zipp==3.8.0&lt;/LI-CODE&gt;&lt;P&gt;Hope this helps.&lt;/P&gt;</description>
      <pubDate>Wed, 15 Nov 2023 01:02:53 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dlt-pipeline-mlflow-udf-error/m-p/51966#M29366</guid>
      <dc:creator>BarryC</dc:creator>
      <dc:date>2023-11-15T01:02:53Z</dc:date>
    </item>
    <item>
      <title>Re: DLT pipeline MLFlow UDF error</title>
      <link>https://community.databricks.com/t5/data-engineering/dlt-pipeline-mlflow-udf-error/m-p/51978#M29373</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/9"&gt;@Retired_mod&lt;/a&gt;&amp;nbsp;, when running the code in a DLT pipeline, we don't get to choose which cluster the pipeline will use. The DLT pipeline just stands up a random temporary cluster that it uses for the run. Apologies if I am technically off, but from a high level that's how I see the DLT pipeline working.&amp;nbsp;&lt;/P&gt;&lt;P&gt;The only place I see where the DLT pipeline can be configured with additional libraries is within the pipeline code itself, by installing them with syntax such as &lt;SPAN&gt;# MAGIC&amp;nbsp;&lt;/SPAN&gt;%pip install mlflow, etc. I've also tried to resolve the error by running the code with the added&amp;nbsp;&lt;SPAN&gt;# MAGIC %pip install importlib_metadata (as shown in the code in my post), but no luck.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;As the documentation states, the mlflow library has to be installed within the notebook of the DLT pipeline run.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Feather_0-1700022248426.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/4925i1B069037A5C06754/image-size/medium/is-moderation-mode/true?v=v2&amp;amp;px=400" role="button" title="Feather_0-1700022248426.png" alt="Feather_0-1700022248426.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 15 Nov 2023 04:26:01 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dlt-pipeline-mlflow-udf-error/m-p/51978#M29373</guid>
      <dc:creator>Feather</dc:creator>
      <dc:date>2023-11-15T04:26:01Z</dc:date>
    </item>
    <item>
      <title>Re: DLT pipeline MLFlow UDF error</title>
      <link>https://community.databricks.com/t5/data-engineering/dlt-pipeline-mlflow-udf-error/m-p/51980#M29375</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/92295"&gt;@Feather&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Have you tried specifying the version of the library as well?&lt;/P&gt;</description>
      <pubDate>Wed, 15 Nov 2023 05:21:58 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dlt-pipeline-mlflow-udf-error/m-p/51980#M29375</guid>
      <dc:creator>BarryC</dc:creator>
      <dc:date>2023-11-15T05:21:58Z</dc:date>
    </item>
    <item>
      <title>Re: DLT pipeline MLFlow UDF error</title>
      <link>https://community.databricks.com/t5/data-engineering/dlt-pipeline-mlflow-udf-error/m-p/51981#M29376</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/42997"&gt;@BarryC&lt;/a&gt;&amp;nbsp;this has worked.&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;%pip install importlib-metadata==4.11.3
%pip install zipp==3.8.0&lt;/LI-CODE&gt;&lt;P&gt;Adding these two lines at the start of your DLT UDF register notebook will solve the issue.&lt;BR /&gt;&lt;BR /&gt;Databricks advocates using DLT for ML inference in all its docs and tutorials, but this is a standard incompatibility inherent to the setup. I hope Databricks will take action and resolve this asap.&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Kind regards,&lt;/P&gt;&lt;P&gt;Data Interlaced ltd&lt;BR /&gt;&lt;BR /&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 15 Nov 2023 07:33:11 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dlt-pipeline-mlflow-udf-error/m-p/51981#M29376</guid>
      <dc:creator>Data_Interlaced</dc:creator>
      <dc:date>2023-11-15T07:33:11Z</dc:date>
    </item>
    <item>
      <title>Re: DLT pipeline MLFlow UDF error</title>
      <link>https://community.databricks.com/t5/data-engineering/dlt-pipeline-mlflow-udf-error/m-p/52089#M29393</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/42997"&gt;@BarryC&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;&lt;P&gt;Your solution worked! I will vote your solution up.&lt;/P&gt;&lt;P&gt;I did get a new error, though (below). I think this one is a mistake of mine, so I might have to open a new question for it.&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 509.0 failed 4 times, most recent failure: Lost task 0.3 in stage 509.0 (TID 830) (10.186.194.4 executor 0): org.apache.spark.SparkRuntimeException: [UDF_USER_CODE_ERROR.GENERIC] Execution of function udf(named_struct(close, close#46263, high, high#46264, low, low#46265, open, open#46266, symbol, symbol#46267, timestamp, timestamp#46268, tradingDay, tradingDay#46269, volume, volume#46270, yr, yr#46271, month, month#46272, dayy, dayy#46273, tradeTimestamp, tradeTimestamp#46274, ... 6 more fields)) failed. 
== Error ==
BrokenPipeError: [Errno 32] Broken pipe&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks!&lt;/P&gt;</description>
      <pubDate>Wed, 15 Nov 2023 16:38:14 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dlt-pipeline-mlflow-udf-error/m-p/52089#M29393</guid>
      <dc:creator>Feather</dc:creator>
      <dc:date>2023-11-15T16:38:14Z</dc:date>
    </item>
    <item>
      <title>Re: DLT pipeline MLFlow UDF error</title>
      <link>https://community.databricks.com/t5/data-engineering/dlt-pipeline-mlflow-udf-error/m-p/54017#M29962</link>
      <description>&lt;P&gt;Were you able to fix this error? I am getting the same error.&lt;/P&gt;</description>
      <pubDate>Mon, 27 Nov 2023 18:53:54 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dlt-pipeline-mlflow-udf-error/m-p/54017#M29962</guid>
      <dc:creator>jsc</dc:creator>
      <dc:date>2023-11-27T18:53:54Z</dc:date>
    </item>
  </channel>
</rss>

