Unable to call logged ML model from a different notebook when using Spark ML

DK
New Contributor II

Hi, I am an R user, and I am experimenting with building an ML model in R with Spark-flavoured algorithms on Databricks. However, I am struggling to call a model that was logged as part of an experiment from a different notebook when I use Spark-flavoured algorithms. This intro may not make much sense, but please bear with me; I will explain the issue below with code. I used the Databricks Community Edition, and I get the same result with my office's licensed Databricks service.

Platform: Databricks Community Edition

Cluster: Databricks Runtime 11.2 ML (includes Apache Spark 3.3.0, Scala 2.12)

Language: R

In notebook 1:

# I used the built-in iris data so this can be replicated
install.packages("mlflow")
library(mlflow)
install_mlflow()
install.packages("carrier")
library(sparklyr)
 
 
sc <- sparklyr::spark_connect(method = "databricks")
 
# convert to spark table
iris_tbl <- sparklyr::sdf_copy_to(sc, iris, "iris", overwrite = TRUE)
 
# build a model with kmeans based on sparklyr lib and use MLflow to log
with(mlflow_start_run(),{
 kmeans_model <- sparklyr::ml_kmeans(iris_tbl, k = 3, features = c("Petal_Length", "Petal_Width"))
  
 predicted <- carrier::crate(~sparklyr::ml_predict(!!kmeans_model, .x))
  
 mlflow_log_model(predicted, "model")
})
 
 
# call the logged model from the experiment artifact
logged_model <- 'runs:/996bc4f0ad2a4681a4acf42515ee73d5/model'
loaded_model <- mlflow_load_model(logged_model)
 
# predict using the loaded model
loaded_model(iris_tbl)
 
# predicts perfectly
# Source: spark<?> [?? x 7]
   Sepal_Length Sepal_Width Petal_Length Petal_Width Species features  predict…¹
          <dbl>       <dbl>        <dbl>       <dbl> <chr>   <list>        <int>
 1          5.1         3.5          1.4         0.2 setosa  <dbl [2]>         1
 2          4.9         3            1.4         0.2 setosa  <dbl [2]>         1
 3          4.7         3.2          1.3         0.2 setosa  <dbl [2]>         1
 4          4.6         3.1          1.5         0.2 setosa  <dbl [2]>         1
 5          5           3.6          1.4         0.2 setosa  <dbl [2]>         1
 6          5.4         3.9          1.7         0.4 setosa  <dbl [2]>         1
 7          4.6         3.4          1.4         0.3 setosa  <dbl [2]>         1
 8          5           3.4          1.5         0.2 setosa  <dbl [2]>         1
 9          4.4         2.9          1.4         0.2 setosa  <dbl [2]>         1
10          4.9         3.1          1.5         0.1 setosa  <dbl [2]>         1
# … with more rows, and abbreviated variable name ¹ prediction
# ℹ Use `print(n = ...)` to see more rows
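As a side note, the run ID in logged_model above is copied by hand from the experiment UI. The same URI can be built programmatically; a minimal sketch, assuming mlflow's R API where mlflow_id() extracts the ID from the run object returned by mlflow_start_run():

# sketch: log the model and build the runs:/ URI without hard-coding the ID
run <- mlflow_start_run()
kmeans_model <- sparklyr::ml_kmeans(iris_tbl, k = 3,
                                    features = c("Petal_Length", "Petal_Width"))
predicted <- carrier::crate(~sparklyr::ml_predict(!!kmeans_model, .x))
mlflow_log_model(predicted, "model")
mlflow_end_run()
 
logged_model <- paste0("runs:/", mlflow_id(run), "/model")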

However, in notebook 2, when I try to predict with the same loaded model, it throws an error that I can't make any sense of.

# install and load libraries
install.packages("mlflow")
library(mlflow)
install_mlflow()
 
install.packages("carrier")
library(sparklyr)
 
 
sc <- sparklyr::spark_connect(method = "databricks")
iris_tbl <- sparklyr::sdf_copy_to(sc, iris, "iris", overwrite = TRUE)
 
# call the logged model and load it
logged_model <- 'runs:/996bc4f0ad2a4681a4acf42515ee73d5/model'
loaded_model <- mlflow_load_model(logged_model)
 
# predict
loaded_model(iris_tbl)
 
# However, the following error pops up:
 
 
Error : java.lang.IllegalArgumentException: Object not found 171
	at sparklyr.StreamHandler.handleMethodCall(stream.scala:115)
	at sparklyr.StreamHandler.read(stream.scala:62)
	at sparklyr.BackendHandler.$anonfun$channelRead0$1(handler.scala:60)
	at scala.util.control.Breaks.breakable(Breaks.scala:42)
	at sparklyr.BackendHandler.channelRead0(handler.scala:41)
	at sparklyr.BackendHandler.channelRead0(handler.scala:14)
	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:327)
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:299)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:722)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:658)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:584)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:496)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.lang.Thread.run(Thread.java:748)
 
Error: java.lang.IllegalArgumentException: Object not found 171

Can you please help me resolve this issue? Thanks.

 

1 REPLY

Anonymous
Not applicable

@Dip Kundu:

It seems like the error you are facing is related to sparklyr, which is used to interact with Apache Spark from R, rather than to mlflow itself. The error message says an object could not be found; a likely explanation is that sparklyr model objects are handles into the Spark backend of the session that created them, so a crated function that closes over one can end up referencing an object ID (here, 171) that does not exist in a new notebook's session.

It's also possible that the error is caused by calling loaded_model as a function when it is not one. When you load a model with mlflow_load_model(), you get an R object that represents the loaded model, and you can use this object to make predictions, but you cannot call the object itself as a function.

Instead, use the appropriate method for making predictions with the loaded model. Which method depends on the type of model you loaded: for a Spark MLlib model, use ml_predict() from sparklyr; for an R model, use the predict() method appropriate to that model.

Here is an example of how you can use ml_predict() to make predictions with a KMeans model loaded from mlflow:

# load the libraries and connect to Spark
library(mlflow)
library(sparklyr)
sc <- spark_connect(method = "databricks")
 
# load the model
logged_model <- 'runs:/996bc4f0ad2a4681a4acf42515ee73d5/model'
loaded_model <- mlflow_load_model(logged_model)
 
# create a Spark DataFrame with the data you want to predict on
data_to_predict <- sdf_copy_to(sc, iris, "data_to_predict", overwrite = TRUE)
 
# use ml_predict() to make predictions
predictions <- ml_predict(loaded_model, data_to_predict)

Note that you need to provide a Spark DataFrame as the second argument to ml_predict(). In this example, I'm creating a new Spark DataFrame with the data you want to predict on.
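If ml_predict() on the MLflow-loaded object still fails across notebooks, a workaround worth trying (not from this thread; it swaps in sparklyr's native persistence instead of the MLflow artifact) is to save the fitted Spark pipeline model with ml_save() and reload it in the other notebook with ml_load(). This avoids carrying a reference to a Spark object from a session that no longer exists. A minimal sketch, assuming a DBFS path of your choosing:

# in notebook 1: persist the fitted Spark ML model to shared storage
# (the dbfs:/tmp path is an assumption; use any path both notebooks can reach)
ml_save(kmeans_model, "dbfs:/tmp/kmeans_model", overwrite = TRUE)
 
# in notebook 2: reconnect and load the model into the new session
sc <- spark_connect(method = "databricks")
restored_model <- ml_load(sc, "dbfs:/tmp/kmeans_model")
predictions <- ml_predict(restored_model, data_to_predict)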
