Error: batch scoring with mlflow.keras flavor model

New Contributor II

I am logging a trained keras model using the following: 

flavor=mlflow.keras,

And when I call the following:

predictions_df = fe.score_batch(model_uri=f"models:/{model_name}/{latest_model_version}", df=batch_input_df)

I get the following error:

OSError: [Errno 30] Read-only file system: '/local_disk0/.ephemeral_nfs/repl_tmp_data/ReplId-62b1a-a1cdf-baa5b-1/mlflow/models/tmpajr94lkz/raw_model/data/model.keras

I get the same 

I am essentially just trying to adapt this example to use a Keras model instead of a Random Forest. The code runs fine with a Random Forest, and it also runs fine if I use only the native MLflow log_model(), load_model() and predict() functions. However, I do get the error if I use mlflow.pyfunc.load_model() to load the model and then call model.predict(). This makes me think the bug is specific to the way the Databricks FeatureEngineeringClient saves the Keras model.

I would appreciate any help with this issue.


Community Manager

Hi @Miki, the OSError: [Errno 30] Read-only file system error typically occurs when you attempt to write to a path on a file system that is mounted read-only.

Let’s explore some possible solutions:

  1. Check the Path:

    • Ensure that the path you’ve provided for saving the Keras model is correct and points to a writable directory. Double-check the directory structure and permissions.
    • If you’re using a relative path, make sure it’s relative to the correct working directory.
  2. Absolute Path:

    • Instead of using a relative path, consider using an absolute path. Absolute paths start from the root directory and are less prone to errors related to working directories.
    • For example, use /tmp/ instead of tmp/.
  3. Temporary Directory:

    • If you’re saving temporary files, consider using a temporary directory (such as /tmp on Unix-like systems) that is writable.
    • You can create a temporary directory within your code and use it for saving the model.
  4. Permissions:

    • Ensure that the user running the code has the necessary permissions to write to the specified directory.
    • If you’re running the code in a restricted environment (such as AWS Lambda), be aware of the read-only file system limitations.
  5. Keras Model Saving:

    • When saving a Keras model, make sure you’re using the appropriate method. For example:
      • Use model.save(filepath) to save the entire model.
      • Use model.save_weights(filepath) to save only the model weights.
    • Verify that the model is being saved correctly.
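
The writability and temporary-directory suggestions above can be sketched with the Python standard library alone. This is only an illustration: the helper name `is_writable_dir` is hypothetical, and the `model.keras` path is a placeholder rather than a real saved model.

```python
import errno
import os
import tempfile


def is_writable_dir(path: str) -> bool:
    """Return True if `path` is an existing directory we can create a file in."""
    if not os.path.isdir(path):
        return False
    try:
        # Attempting the write tests the real operation; permission checks
        # alone can be misleading on network file systems.
        with tempfile.NamedTemporaryFile(dir=path):
            pass
        return True
    except OSError as exc:
        # EROFS is the errno behind "Read-only file system".
        if exc.errno in (errno.EROFS, errno.EACCES, errno.EPERM):
            return False
        raise


# Create a fresh scratch directory owned by the current user.
scratch_dir = tempfile.mkdtemp(prefix="model_scratch_")
scratch_is_writable = is_writable_dir(scratch_dir)

# Placeholder path only; a real model file would be written here.
placeholder_path = os.path.join(scratch_dir, "model.keras")
```

Running the same check against the directory named in the error message would show whether that location is writable from the process that raises the error.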

New Contributor II

Hi Kaniz,

Thanks for the response. Apologies if I am missing something, but since I am directly using the Databricks FeatureEngineeringClient.log_model() method, I am not given the option to specify the path the model is written to. The only parameters I can provide are the artifact path and the model name, neither of which gives me enough control to implement the solutions you are suggesting. I could potentially define a custom pyfunc rather than using the existing mlflow.keras flavor and then define my own save_model() and load_model() functions. However, I am struggling to see why this error happens only when I use the FeatureEngineeringClient() to log and load my model, while everything works fine when I use the MLflow logging and loading (although this prevents me from leveraging the automatic feature lookups provided by the feature store).

Am I missing something? 

New Contributor II

I figured it out based on this issue that someone posted. After I switched to a single-node cluster (so worker-node permissions no longer come into play), the code works.
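
For reference, a single-node cluster can be described with a spec along these lines when it is created through the Databricks Clusters API. This is a sketch, not the exact configuration used here: the cluster name and variable name are hypothetical, and the `spark_version` and `node_type_id` values are placeholders to fill in for your workspace.

```python
# Sketch of a single-node cluster spec (driver only, no workers).
single_node_cluster_spec = {
    "cluster_name": "single-node-scoring",  # hypothetical name
    "spark_version": "<runtime-version>",   # placeholder, not a real version
    "node_type_id": "<node-type>",          # placeholder, not a real node type
    "num_workers": 0,                       # no worker nodes
    "spark_conf": {
        "spark.databricks.cluster.profile": "singleNode",
        "spark.master": "local[*]",
    },
    "custom_tags": {"ResourceClass": "SingleNode"},
}
```

The same setup can also be chosen in the cluster creation UI by picking the single-node option instead of writing the spec by hand.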

