Re: Scaling issue for inference with a spark.mllib... - Databricks Community

Thanks for the help.

skew: I had not looked into this yet, I'll check it
spark.sql.shuffle.partitions: already done
bigger driver: already done
udf: there is indeed a UDF used to retrieve the prediction score, I'll have a look at that too

Update:

skew: there was significant skew, but the problem remained after I fixed it
udf: the only UDF in use is the model UDF, which should already be optimized
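
For anyone wanting to double-check the skew point, here is a minimal sketch of one way to measure it in PySpark. It is not the exact check I ran: the helper names `skew_ratio` and `partition_sizes` are made up for illustration, and comparing the largest partition to the median is just one possible metric.

```python
from statistics import median


def skew_ratio(partition_sizes):
    """Largest partition size divided by the median size (1.0 means balanced)."""
    # Ignore empty partitions so they do not drag the median down to zero.
    sizes = [s for s in partition_sizes if s > 0]
    if not sizes:
        return 1.0
    return max(sizes) / median(sizes)


def partition_sizes(df):
    """Row count per Spark partition of a DataFrame (hypothetical helper)."""
    # Imported lazily so skew_ratio can be used without pyspark installed.
    from pyspark.sql.functions import spark_partition_id

    rows = df.groupBy(spark_partition_id().alias("pid")).count().collect()
    return [row["count"] for row in rows]
```

Usage would be something like `skew_ratio(partition_sizes(df))` on the DataFrame right before the prediction step. A ratio close to 1 suggests the partitions are balanced; a much larger value means a few partitions hold most of the rows.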