10-31-2025 08:57 AM
Thanks for sharing the code and the context. Here are the core issues I see and how to fix them so MLflow logging works reliably on Databricks.
What’s breaking MLflow logging in your code
- Your PyFunc wrapper loads the AutoGluon model from a local path rather than from the MLflow model's packaged artifacts. In `PythonModel.load_context`, you must read any files from `context.artifacts[...]`. Otherwise, loading or serving the model will fail when that local path doesn't exist in the target environment.
- The `input_example` and signature inference are misaligned. You pass `self.X_train[:2]`, but `self.X_train` is never defined; also, `input_example` must match the schema you infer with `infer_signature(model_input=..., model_output=...)`. Use a small slice of `train_features` (a DataFrame with the target dropped) for both the signature and the example.
- The `classification_report` arguments are incorrect. It expects `y_true` and `y_pred` (discrete labels), but you pass `X` as `y_true` and rounded probabilities as `y_pred`. Pass `self.val_data[self.target_col]` and `(self.val_predictions > 0.5).astype(int)` (or a tuned threshold) instead.
- `brier_score_loss` expects probabilities, not thresholded predictions. Use the raw positive-class probabilities `y_pred_proba` (shape `(n_samples,)`) for the Brier score, not `(y_pred > 0.5)`. If you need the 0–1 range, set `scale_by_half=True` (the binary default is usually auto).
- `evaluate_model` uses undefined attributes (`self.X_train`, `self.y_true`). Use your stored train/validation splits and compute AUC with `roc_auc_score(y_true, y_score)`, where `y_score` holds the positive-class probabilities.
- The AutoGluon `path` pointing to `/Shared/...` is a workspace path, not a filesystem location. Use a real local/temp directory (for example via `tempfile.mkdtemp()`), then package it into the MLflow model artifacts with `artifacts={"ag_predictor": <local_dir>}` and load it with `context.artifacts[...]` in your PyFunc.
- Make sure to set the MLflow experiment to a workspace path (like `/Shared/...`), which is supported on Databricks; if you want artifacts stored in UC Volumes, create the experiment with a UC Volume artifact location.
- Finally, ensure runtime dependencies (AutoGluon plus its model backends, e.g., LightGBM, XGBoost, CatBoost) are present when loading/serving the model. Use `conda_env` or `extra_pip_requirements` in `mlflow.pyfunc.log_model` so MLflow reproduces the environment cleanly.
Here are some code patches:
1) Fix the PyFunc wrapper to read from packaged artifacts:
```python
import mlflow
import pandas as pd
from mlflow.pyfunc import PythonModel
from autogluon.tabular import TabularPredictor


class AutoGluonPyFuncWrapper(PythonModel):
    """Wrapper for AutoGluon model to be logged as a PyFunc model in MLflow."""

    def __init__(self):
        self.predictor = None

    def load_context(self, context):
        # Load the predictor directory that was logged as an artifact
        predictor_dir = context.artifacts["ag_predictor"]
        self.predictor = TabularPredictor.load(predictor_dir)

    def predict(self, context, model_input):
        # Accept dict/list; convert to DataFrame
        if not isinstance(model_input, pd.DataFrame):
            model_input = pd.DataFrame(model_input)
        # Probability of the positive class
        proba_df = self.predictor.predict_proba(model_input)
        # Choose the positive label robustly (prefer 1 if present)
        class_labels = list(proba_df.columns)
        pos_label = 1 if 1 in class_labels else class_labels[-1]
        return proba_df[pos_label]  # pandas Series of positive-class probabilities
```
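Once logged (see patch 2), MLflow rehydrates this wrapper by downloading the `ag_predictor` artifact and handing `load_context` its local path. A minimal load-back sketch, assuming `run_id` comes from the logging step below and `val_X` is a feature DataFrame:

```python
import mlflow

# Load the logged PyFunc model; MLflow resolves context.artifacts["ag_predictor"]
# to a local download of the predictor directory.
loaded_model = mlflow.pyfunc.load_model(f"runs:/{run_id}/model")

# Returns positive-class probabilities, matching the logged signature
probabilities = loaded_model.predict(val_X.head(5))
```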
2) Log the AutoGluon predictor directory as an MLflow artifact and align the signature:
```python
import tempfile

import mlflow
from mlflow.models.signature import infer_signature

# Choose a real local directory for AutoGluon training output
local_model_dir = tempfile.mkdtemp(prefix="ag_predictor_")

with mlflow.start_run() as run:
    # Train AutoGluon
    self.predictor = TabularPredictor(
        problem_type="binary",
        label=self.target_col,
        eval_metric="roc_auc",
        path=local_model_dir,
    ).fit(
        self.train_data,
        excluded_model_types=["KNN", "RF"],
        hyperparameters=hyperparameters,
        presets="best_quality",
        num_bag_folds=3,
        num_stack_levels=1,
        time_limit=time_limit,
        verbosity=1,
        num_cpus=4,
        num_gpus=0,
        ag_args_fit={"num_cpus": 1, "num_gpus": 0},
    )

    # Compute train/val probabilities for metrics
    train_X = self.train_data.drop(columns=[self.target_col])
    val_X = self.val_data.drop(columns=[self.target_col])
    self.train_predictions = self.predictor.predict_proba(train_X).iloc[:, -1]
    self.val_predictions = self.predictor.predict_proba(val_X).iloc[:, -1]

    # Metrics (see patch 3 below)
    self.compute_metrics(self.train_data[self.target_col], self.train_predictions, "train")
    self.compute_metrics(self.val_data[self.target_col], self.val_predictions, "validation")

    # Signature and input_example must match the wrapper's input/output
    input_example = train_X.head(2)
    signature = infer_signature(model_input=input_example, model_output=self.train_predictions.head(2))

    # Log the PyFunc model and the trained predictor directory as an artifact
    mlflow.pyfunc.log_model(
        artifact_path="model",
        python_model=AutoGluonPyFuncWrapper(),
        artifacts={"ag_predictor": local_model_dir},
        signature=signature,
        input_example=input_example,
        # Strongly recommended: pin pip requirements to include AutoGluon & backends
        extra_pip_requirements=[
            "mlflow>=2.8.0",             # adjust to your workspace runtime
            "autogluon.tabular>=1.1.0",  # pin your version
            "xgboost>=1.7.0",
            "lightgbm>=3.3.5",
            "catboost>=1.2",
        ],
    )

    self.run_id = run.info.run_id
```
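While the run is still active (i.e., inside the `with mlflow.start_run()` block above), it can also help to persist AutoGluon's leaderboard for later debugging. An optional sketch; the `leaderboard.csv` file name is my choice, not part of your code:

```python
# Optional: log AutoGluon's per-model leaderboard as a run artifact
leaderboard = self.predictor.leaderboard()  # DataFrame of model names and scores
mlflow.log_text(leaderboard.to_csv(index=False), "leaderboard.csv")
```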
3) Correct your metric logging:
```python
import mlflow
from sklearn.metrics import (
    roc_auc_score,
    average_precision_score,
    f1_score,
    fbeta_score,
    brier_score_loss,
    recall_score,
    precision_score,
    classification_report,
)


def compute_metrics(self, y_true, y_pred_proba, prefix):
    # y_pred_proba: probabilities of the positive class
    y_pred_bin = (y_pred_proba > 0.5).astype(int)
    metrics = {
        f"{prefix}_auc": roc_auc_score(y_true, y_pred_proba),
        f"{prefix}_average_precision": average_precision_score(y_true, y_pred_proba),
        f"{prefix}_f1_score": f1_score(y_true, y_pred_bin),
        f"{prefix}_f2_score": fbeta_score(y_true, y_pred_bin, beta=2.0),
        f"{prefix}_brier_score": brier_score_loss(y_true, y_pred_proba),
        f"{prefix}_recall": recall_score(y_true, y_pred_bin),
        f"{prefix}_precision": precision_score(y_true, y_pred_bin),
    }
    for k, v in metrics.items():
        mlflow.log_metric(k, float(v))
    return metrics


def log_classification_report(self):
    # Use validation-set labels and thresholded predictions
    y_true = self.val_data[self.target_col]
    y_pred_bin = (self.val_predictions > 0.5).astype(int)
    report = classification_report(y_true, y_pred_bin, output_dict=True)
    mlflow.log_dict(report, "classification_report.json")
```
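The fixed 0.5 cutoff above is just a default. If you want the "tuned threshold" mentioned earlier, one common recipe (a sketch, not part of your original code) is to pick the cutoff that maximizes F1 on the validation set:

```python
import numpy as np
from sklearn.metrics import precision_recall_curve


def tune_threshold(self, y_true, y_pred_proba):
    # precision/recall have one more entry than thresholds, so drop the last point
    precision, recall, thresholds = precision_recall_curve(y_true, y_pred_proba)
    f1 = 2 * precision[:-1] * recall[:-1] / np.clip(precision[:-1] + recall[:-1], 1e-12, None)
    best = int(np.argmax(f1))
    best_threshold = float(thresholds[best])
    mlflow.log_metric("tuned_threshold", best_threshold)
    return best_threshold
```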
4) Fix `evaluate_model` to use your stored splits:
```python
def evaluate_model(self):
    # Use the validation-set probabilities already computed
    auc_score = roc_auc_score(self.val_data[self.target_col], self.val_predictions)
    print(f"Model AUC (validation): {auc_score:.4f}")
    return auc_score
```
A few Databricks-specific practices to keep this robust:
- Set the workspace experiment path once (recommended): `mlflow.set_experiment(f"/Shared/automl_experiments/{self.experiment_name}")`. If you want to store artifacts in UC Volumes, create the experiment with an artifact location at a UC Volume path first, then set it active by path (see the sketch after this list).
- Package all runtime dependencies with the model (pip/conda), especially AutoGluon and its tree learners. You can use `extra_pip_requirements` (shown above) or supply a `conda_env` dict if you prefer hard-pinning Python and Conda channels.
- Always load files via `context.artifacts[...]` in `load_context`. MLflow will download artifacts next to the model and pass you local paths at runtime; don't assume workspace or DBFS paths exist when the model is rehydrated.
- Align `input_example` with your signature and the wrapper's input type (DataFrame rows of features). A signature plus `input_example` improves handoff, validation, and serving.
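For the UC Volumes option in the first bullet, a minimal sketch; the experiment and volume paths here are placeholders, not your actual locations:

```python
import mlflow
from mlflow.tracking import MlflowClient

experiment_path = "/Shared/automl_experiments/my_experiment"  # hypothetical path
artifact_location = "dbfs:/Volumes/main/ml/mlflow_artifacts"  # hypothetical UC Volume

# Create the experiment with the volume artifact location only if it doesn't exist,
# then set it active by workspace path.
client = MlflowClient()
if client.get_experiment_by_name(experiment_path) is None:
    client.create_experiment(experiment_path, artifact_location=artifact_location)
mlflow.set_experiment(experiment_path)
```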