
Model Serving Endpoints - Build configuration and Interactive access

rasgaard
New Contributor

Hi there 🙂

I have used Databricks Model Serving Endpoints to serve a model that depends on some config files and a custom library. The library has been included by logging the model with the `code_path` argument in `mlflow.pyfunc.log_model`, and it works perfectly fine. I wanted to do the same with the config files, but I couldn't work out where exactly the MLflow model gets copied to inside the container that the Model Serving Endpoint builds.

After a bit of debugging I figured out that for local builds with `mlflow models build-docker`, the model files are copied to `/opt/ml/model/`, so I imagined that Model Serving Endpoints used the same command under the hood. That assumption turned out to be wrong: on the Serving Endpoint build, the model files end up in `/model/` instead.
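
For anyone hitting the same thing, this is roughly how I did the local inspection (a sketch; the model URI and image name are placeholders):

mlflow models build-docker -m "runs:/<run-id>/model" -n my-model-image
docker run -it --rm --entrypoint /bin/bash my-model-image

From the shell inside the container you can `ls /opt/ml/model/` and see exactly where everything lands.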

So my question is: how, or where, do I get insight into the build process for Model Serving Endpoint builds? The placement of the files seems to come from a custom Dockerfile that I can't find any specification or documentation for. It would also be amazing to have interactive access to the container that hosts the Serving Endpoint, as that would make debugging a whole lot easier.


Thanks in advance 🙂

1 REPLY

robbe
New Contributor III

Hi @rasgaard, one way to achieve that without inspecting the container is to use MLflow artifacts. Artifacts allow you to log files together with your models and reference them inside the endpoint.

For example, let's assume that you need to include a YAML config file that controls the preprocessing of your model's inputs in the endpoint. In your training script you have:

artifacts = {
    "model_path": model_path,  # Path to the serialised model file
    "pipeline_config_path": pipeline_config_path,  # Path to the preprocessing config file
}

model_info = mlflow.pyfunc.log_model(
    artifact_path="model",
    python_model=ModelWrapper(),
    artifacts=artifacts,
    code_path=[<your-code-path>],
    ...  # Additional arguments to model logging
)
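
When the model is logged, MLflow copies each artifact into the model's directory and rewrites the paths, so at serving time `context.artifacts` points at the copied files rather than your original locations. A quick way to sanity-check this before deploying is to load the model back and call it locally (a minimal sketch; the dummy input is purely illustrative, use whatever matches your model's signature):

import mlflow
import pandas as pd

# Loading the model back runs load_context, so broken artifact paths fail here
# rather than inside the serving container.
loaded_model = mlflow.pyfunc.load_model(model_info.model_uri)

# Illustrative dummy input.
print(loaded_model.predict(pd.DataFrame({"feature_1": [0.1, 0.2]})))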

And in the ModelWrapper class:

import mlflow
import yaml
from mlflow.pyfunc import PythonModel

from your_code import preprocess_inputs


class ModelWrapper(PythonModel):
    """Wrapper around the model class.

    It allows the custom model to be registered as a customised MLflow model with the
    "python_function" ("pyfunc") flavor, leveraging custom inference logic and artifact
    dependencies.
    """

    def __init__(self) -> None:
        """Initialise the wrapper."""
        self.model = None
        self.pipeline_config = None

    def load_context(self, context: mlflow.pyfunc.PythonModelContext) -> None:
        """Load the model from the context.

        Args:
            context (PythonModelContext): Instance containing artifacts that the model
                can use to perform inference.
        """
        from joblib import load

        self.model = load(context.artifacts["model_path"])

        pipeline_config_path = context.artifacts["pipeline_config_path"]
        with open(pipeline_config_path) as f:
            self.pipeline_config = yaml.safe_load(f)  # We assume that pipeline_config is a dict

    def predict(self, context: mlflow.pyfunc.PythonModelContext, model_input):
        """Make predictions using the wrapper.

        Args:
            context (PythonModelContext): Instance containing artifacts that the model
                can use to perform inference.
            model_input: Model inputs for which to generate predictions.

        Returns:
            The predictions of the estimator on the inputs.
        """
        inputs = preprocess_inputs(model_input, **self.pipeline_config)
        return self.model.predict(inputs)
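
For completeness, the config file and `preprocess_inputs` could look something like this (entirely hypothetical, just to show how the YAML keys arrive as keyword arguments via `**self.pipeline_config`, assuming a pandas DataFrame input):

# preprocessing.yaml (example contents):
#
#   drop_columns:
#     - "id"
#   clip_max: 10.0


def preprocess_inputs(model_input, drop_columns=None, clip_max=None):
    """Toy preprocessing step; each key in the YAML config becomes a keyword argument."""
    if drop_columns:
        model_input = model_input.drop(columns=drop_columns)
    if clip_max is not None:
        model_input = model_input.clip(upper=clip_max)
    return model_input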

You can find more info here: https://mlflow.org/docs/latest/python_api/mlflow.pyfunc.html#creating-custom-pyfunc-models

Hopefully this is of help, let me know!
