Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

How much do model size and inference lag impact distributed inference?

anvil
New Contributor II

Hello!

I was wondering how impactful a model's size or inference lag is in a distributed setting.

With tools like pandas Iterator UDFs or mlflow.pyfunc.spark_udf(), we can ensure models are loaded only once per worker. So I would tend to say that minimizing inference lag is more important than minimizing size, since size costs us once per model load on each worker, whereas lag costs us once per observation.
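
For reference, here is a minimal sketch of that pattern with a pandas Iterator UDF; the model path, loader, and feature column are hypothetical placeholders, not something from a real project:

```python
from typing import Iterator

import joblib
import pandas as pd
from pyspark.sql.functions import pandas_udf

@pandas_udf("double")
def predict(batches: Iterator[pd.Series]) -> Iterator[pd.Series]:
    # Loaded once per worker task: model size / load time is paid here,
    # not per row. The path is a hypothetical example.
    model = joblib.load("/dbfs/models/my_model.pkl")
    for features in batches:
        # Per-observation inference lag is paid here, for every row
        # in every batch the worker processes.
        yield pd.Series(model.predict(features.to_frame()))

# Usage (hypothetical column name):
# scored = df.withColumn("prediction", predict("feature"))
```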

I would also say that the impact is even greater with ensemble models, where several models, each with its own lag, need to run once per observation.
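
A rough back-of-envelope cost model of what I mean (all numbers are made up for illustration):

```python
# Wall-clock sketch: the load is paid once per worker (in parallel),
# while per-row lag is paid for every observation a worker processes.
n_workers = 8
n_rows = 10_000_000
load_s = 5.0      # one-time model load per worker
lag_s = 0.001     # per-observation inference lag, per model
n_models = 3      # ensemble size

single = load_s + (n_rows / n_workers) * lag_s
ensemble = n_models * (load_s + (n_rows / n_workers) * lag_s)
print(f"single: {single:.0f}s, ensemble: {ensemble:.0f}s")
# single: 1255s, ensemble: 3765s -- the lag term dwarfs the load term
```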

Is this assumption correct?

Thank you!

1 REPLY

youssefmrini
Honored Contributor III

Your assumption that minimizing inference lag is more important than minimizing model size in a distributed setting is generally correct.

In a distributed environment, models are typically loaded once per worker, as you mentioned, which means that the impact of model size is limited to the initial loading of the model. However, inference lag occurs every time an observation is processed, which can have a significant impact on the overall performance of the system.
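
As a minimal sketch of the spark_udf() approach you mentioned (the model registry URI and input DataFrame here are hypothetical):

```python
import mlflow.pyfunc

# Wraps a logged model as a Spark UDF; the model is loaded once per
# worker and then applied batch-wise to the input columns.
predict = mlflow.pyfunc.spark_udf(
    spark,
    model_uri="models:/my_model/Production",  # hypothetical registry URI
    result_type="double",
)
scored = df.withColumn("prediction", predict(*df.columns))
```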
