Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

Logging signature slows down inference to a crawl

Miki
New Contributor II

I am having an issue similar to the one in the post "log signature and input data for Spark LinearRegression". I am using mlflow v2.13.0 and logging my model with mlflow.pyfunc.log_model. I am starting a new post here since there doesn't seem to have been any follow-up from the community on that one.

While I am able to save a signature with array types, running inference on a model logged with that signature is more than 100x slower, which is not acceptable for my use case. Leaving the array columns out of the signature, as the original answer in the linked post suggests, does not work: it throws a key error at inference time on the skipped columns. Without logging a signature, I cannot register the model in Unity Catalog. However, I know there must be a way around this, because Databricks' FeatureEngineeringClient log_model function has no problem registering the model and running inference in a reasonable amount of time, and judging by the logged model schema, it does seem to skip these array columns somehow. I cannot use FeatureEngineeringClient's log_model function, though, because it doesn't allow me to pass in a custom loss function. Any advice here would be appreciated.
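For anyone trying to reproduce the slowdown, here is a minimal timing sketch that compares two prediction callables on the same batch. It is not taken from the original code: the helper names time_predict and slowdown are hypothetical, and the two stand-in functions only simulate a fast and a slow model so the snippet runs on its own.

```python
import time
from statistics import median
from typing import Any, Callable


def time_predict(predict: Callable[[Any], Any], batch: Any, repeats: int = 5) -> float:
    """Return the median wall-clock seconds of predict(batch) over `repeats` runs."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        predict(batch)
        timings.append(time.perf_counter() - start)
    return median(timings)


def slowdown(with_signature: Callable[[Any], Any],
             without_signature: Callable[[Any], Any],
             batch: Any,
             repeats: int = 5) -> float:
    """Return how many times slower `with_signature` is than `without_signature`."""
    baseline = time_predict(without_signature, batch, repeats)
    candidate = time_predict(with_signature, batch, repeats)
    # Guard against a zero baseline on very coarse timers.
    return candidate / max(baseline, 1e-12)


# Stand-ins for the two model variants (assumptions, not real models).
def fast_predict(batch):
    return [x * 2.0 for x in batch]


def slow_predict(batch):
    time.sleep(0.01)  # simulates extra per-call overhead
    return [x * 2.0 for x in batch]


ratio = slowdown(slow_predict, fast_predict, list(range(1000)))
print(f"slowdown factor: {ratio:.1f}x")
```

In practice the two callables would be the predict methods of the same model loaded with mlflow.pyfunc.load_model, once logged with the signature and once without, and the batch would be a representative input DataFrame.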

2 REPLIES

MohsenJ
Contributor

@Miki, can you please share your code for logging the signature with array types?

Miki
New Contributor II

Sure, you can reference the commented-out code here, which builds the signature from a PySpark DataFrame with array types. This signature would then be passed in on L275 (also commented out). Please let me know if you get this working without making inference unbearably slow.
