Error: batch scoring with mlflow.keras flavor model
03-15-2024 06:51 AM - edited 03-15-2024 07:10 AM
I am logging a trained keras model using the following:
    fe.log_model(
        model=model,
        artifact_path="wine_quality_prediction",
        flavor=mlflow.keras,
        training_set=training_set,
        registered_model_name=model_name
    )
And when I call the following:
    predictions_df = fe.score_batch(
        model_uri=f"models:/{model_name}/{latest_model_version}",
        df=batch_input_df
    )
    display(predictions_df)
I get the following error:
OSError: [Errno 30] Read-only file system: '/local_disk0/.ephemeral_nfs/repl_tmp_data/ReplId-62b1a-a1cdf-baa5b-1/mlflow/models/tmpajr94lkz/raw_model/data/model.keras
I get the same
I am essentially just trying to adapt this example to use a Keras model instead of a Random Forest. The code runs fine with a Random Forest, and it also runs fine if I use only the native mlflow log_model(), load_model() and predict() functions. However, I do get the error if I load the model with mlflow.pyfunc.load_model() and call model.predict(). This makes me think the bug is specific to the way the Databricks FeatureEngineeringClient saves the Keras model.
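For anyone trying to narrow this down, a quick check of whether the directory named in the traceback is actually writable from the current process can help separate a mount problem from a model-format problem. The following is only a minimal sketch using the standard library; the helper name is_writable is hypothetical, and the directory you pass in would be whatever path appears in your own error message.

```python
import errno
import os
import tempfile


def is_writable(directory: str) -> bool:
    """Return True if a file can actually be created inside `directory`."""
    if not os.path.isdir(directory):
        return False
    try:
        # Creating a real file is a stronger check than os.access(), which
        # only inspects permission bits and can be misleading on some mounts.
        with tempfile.NamedTemporaryFile(dir=directory):
            pass
        return True
    except OSError as exc:
        # errno.EROFS corresponds to "[Errno 30] Read-only file system".
        if exc.errno in (errno.EROFS, errno.EACCES, errno.EPERM):
            return False
        raise


# Example: probe the system temp directory (normally writable).
print(is_writable(tempfile.gettempdir()))
```

Running the same probe on the driver and inside a Spark task would show whether only the workers see the path as read-only, though that part is not covered by this sketch.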
I would appreciate any help with this issue.
03-18-2024 07:04 AM
Hi Kaniz,
Thanks for the response. Apologies if I am missing something, but since I am calling the Databricks FeatureEngineeringClient.log_model() method directly, I am not given the option to specify the path the model is written to. The only parameters I can provide are the artifact path and the model name, neither of which gives me enough control to implement the solutions you are suggesting. I could potentially define a custom pyfunc rather than using the existing mlflow.keras flavor, and then define my own save_model() and load_model() functions. However, I am struggling to see why this error happens only when I use the FeatureEngineeringClient to log and load my model, while everything works fine when I use plain mlflow logging and loading (although that prevents me from leveraging the automatic feature lookups provided by the feature store).
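If someone does go down the custom pyfunc route, one piece it would probably need is a step that copies the model artifacts out of the read-only location into a writable scratch directory before handing them to Keras. The sketch below shows only that copy step with the standard library. The function name copy_to_writable_dir is hypothetical, and the idea that loading from a writable copy avoids the error is an assumption inferred from the traceback, not something confirmed in this thread.

```python
import shutil
import tempfile
from pathlib import Path


def copy_to_writable_dir(artifact_dir: str) -> Path:
    """Copy a model artifact directory into a fresh temporary directory."""
    source = Path(artifact_dir)
    if not source.is_dir():
        raise FileNotFoundError(f"artifact directory not found: {source}")
    # mkdtemp() creates a new directory owned by the current process.
    target = Path(tempfile.mkdtemp(prefix="model_copy_")) / source.name
    # copytree() copies files and their permission bits into `target`.
    shutil.copytree(source, target)
    return target
```

In a custom mlflow.pyfunc.PythonModel, a call like this could sit at the top of load_context(), with the Keras loader then pointed at the returned path instead of the original artifact path.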
Am I missing something?
03-18-2024 08:48 AM
I figured it out based on this issue that someone posted. After switching to a single-node cluster (so worker-node permissions are no longer a factor), the code now works.
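For readers who create clusters through the API rather than the UI, a single-node cluster is usually described with zero workers plus a few Spark settings. The dictionary below is a rough sketch of such a definition as I understand the Databricks Clusters API; the cluster name is hypothetical, the runtime version and node type are placeholders to fill in for your workspace, and the exact fields should be checked against the current Databricks documentation.

```python
# Sketch of a single-node cluster definition (fields as assumed from the
# Databricks Clusters API; verify against the docs before using).
single_node_cluster = {
    "cluster_name": "keras-batch-scoring",   # hypothetical name
    "spark_version": "<runtime-version>",    # placeholder, workspace-specific
    "node_type_id": "<node-type>",           # placeholder, cloud-specific
    "num_workers": 0,                        # no worker nodes: driver only
    "spark_conf": {
        "spark.databricks.cluster.profile": "singleNode",
        "spark.master": "local[*]",
    },
    "custom_tags": {"ResourceClass": "SingleNode"},
}
```

With no workers, score_batch runs entirely on the driver, which matches the setup that resolved the error here.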