Hi @rasgaard, one way to achieve that without inspecting the container is to use MLflow artifacts. Artifacts allow you to log files together with your models and reference them inside the endpoint.
For example, let's assume the endpoint needs a YAML config file that controls the preprocessing of your model's inputs. In your training script you would have:
```python
artifacts = {
    "model_path": model_path,  # Path to the serialised model file
    "pipeline_config_path": pipeline_config_path,  # Path to the preprocessing config file
}

model_info = mlflow.pyfunc.log_model(
    artifact_path="model",
    python_model=ModelWrapper(),
    artifacts=artifacts,
    code_path=[<your-code-path>],
    ...  # Additional arguments to model logging
)
```
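For completeness, here is a minimal sketch of how that config file could be written to disk before logging. The file name and the keys (`lowercase`, `drop_columns`) are purely hypothetical placeholders for whatever keyword arguments your own preprocessing function expects:

```python
import yaml

# Hypothetical preprocessing settings; replace with the keyword arguments
# that your own preprocess_inputs function actually accepts.
pipeline_config = {"lowercase": True, "drop_columns": ["id"]}

pipeline_config_path = "pipeline_config.yaml"
with open(pipeline_config_path, "w") as f:
    yaml.safe_dump(pipeline_config, f)
```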
And in the ModelWrapper class:
```python
import mlflow
import yaml
from mlflow.pyfunc import PythonModel

from your_code import preprocess_inputs


class ModelWrapper(PythonModel):
    """Wrapper around the model class.

    It allows the custom model to be registered as a customised MLflow model with the
    "python_function" ("pyfunc") flavor, leveraging custom inference logic and artifact
    dependencies.
    """

    def __init__(self) -> None:
        """Initialise the wrapper."""
        self.model = None
        self.pipeline_config = None

    def load_context(self, context: mlflow.pyfunc.PythonModelContext) -> None:
        """Load the model and preprocessing config from the context.

        Args:
            context (PythonModelContext): Instance containing artifacts that the model
                can use to perform inference.
        """
        from joblib import load

        self.model = load(context.artifacts["model_path"])
        pipeline_config_path = context.artifacts.get("pipeline_config_path")
        with open(pipeline_config_path) as f:
            self.pipeline_config = yaml.safe_load(f)  # We assume that pipeline_config is a dict

    def predict(self, context: mlflow.pyfunc.PythonModelContext, model_input):
        """Make predictions using the wrapper.

        Args:
            context (PythonModelContext): Instance containing artifacts that the model
                can use to perform inference.
            model_input: Model inputs for which to generate predictions.

        Returns:
            The predictions of the estimator on the inputs.
        """
        inputs = preprocess_inputs(model_input, **self.pipeline_config)
        return self.model.predict(inputs)
```
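If you want to sanity-check the wrapper before deploying, one option is to load the logged model back through the pyfunc API and call predict on a sample batch. This is just a rough sketch: `sample_input` is a placeholder for whatever input format (e.g. a pandas DataFrame) your model accepts.

```python
import mlflow

# model_info comes from the mlflow.pyfunc.log_model call above.
loaded_model = mlflow.pyfunc.load_model(model_info.model_uri)

# load_context runs at load time, so both the serialised model and the YAML
# config are resolved from the logged artifacts before predict is called.
predictions = loaded_model.predict(sample_input)
```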
You can find more info here: https://mlflow.org/docs/latest/python_api/mlflow.pyfunc.html#creating-custom-pyfunc-models
Hopefully this is of help, let me know!