migq2

Hi @Retired_mod, I'm using mlflow-skinny[databricks]==2.14.3 in a Databricks cluster with DBR 13.3 LTS.

I have tried training a model with the following libraries:

  • Spark MLlib: does not log any signature at all (you can find the snippet to reproduce here)
  • SynapseML LightGBM: logs an input signature but no output signature
  • scikit-learn: logs a signature with both input and output. However, the output signature appears to be tensor-based, which I thought was meant for deep learning use cases, even though my example is a simple classification model on the iris dataset

Here goes the sklearn example:

import mlflow
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier

print(f"MLFLOW version is: {mlflow.__version__}\n")

mlflow.autolog(exclusive=False)

with mlflow.start_run():
    # Train a sklearn model on the iris dataset
    X, y = datasets.load_iris(return_X_y=True, as_frame=True)
    clf = RandomForestClassifier(max_depth=7)
    clf.fit(X, y)
    
    model_info = mlflow.models.get_model_info(f"runs:/{mlflow.active_run().info.run_id}/model")
    
    print("Model signature:")
    print(model_info.signature)

Output:

MLFLOW version is: 2.14.3

Model signature:
inputs: 
  ['sepal length (cm)': double (required), 'sepal width (cm)': double (required), 'petal length (cm)': double (required), 'petal width (cm)': double (required)]
outputs: 
  [Tensor('int64', (-1,))]
params: 
  None