MLFlow Serve Logging

BeardyMan
New Contributor III

When serving a model on Azure Databricks, we have received requests to capture additional logging. In some instances, the requesters would like to capture the input and output, or even some of the intermediate steps of a pipeline.

Is there any way to extend the logging of an MLflow REST endpoint to capture this additional information?

Accepted Solutions

ChenranLi
New Contributor III

Here is an example of a custom model based on the sklearn model "GradientBoostingClassifier":

import sklearn.ensemble

class CustomizedGradientBoostingClassifier(sklearn.ensemble.GradientBoostingClassifier):
  def __init__(self, random_state):
    super().__init__(random_state=random_state)

  def fit(self, X, y):
    # Return self, as scikit-learn estimators do, so that calls can be chained
    return super().fit(X, y)

  def predict_proba(self, X_test):
    return super().predict_proba(X_test)

  def predict(self, X):
    # Do customized tasks here (e.g. issuing an RPC call to log the input and output)

    # For example, you can return not only the predicted result, but also the input
    return (super().predict(X), X)

You can register the model as usual. When you invoke the REST endpoint, the customized predict() function runs your extra logic and returns the input alongside the predicted result.

9 REPLIES 9

Kaniz
Community Manager

Hi @BeardyMan! My name is Kaniz, and I'm the technical moderator here. Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question first. Otherwise, I will follow up shortly with a response.

Dan_Z
Honored Contributor

To my knowledge, if you write a custom model's predict() function, you can perform arbitrary operations in it, such as logging the inputs or outputs to a destination of your choice.

BeardyMan
New Contributor III

Do you mean using Azure Functions and custom Python code to call the model and then perform the required logging, rather than using the MLflow serve capability and the managed REST endpoint?

Dan_Z
Honored Contributor

My thought was:

  1. Create a custom model with a predict function that does extra work (like logging).
  2. Register the model.
  3. Run the model in Model Serving.

BeardyMan
New Contributor III

Thank you for the clarification, I understand what you mean now and that's exactly what I was hoping for! 🙂

BeardyMan
New Contributor III

Thank you, @Chenran Li, the example is exceedingly helpful. I will be sure to try this out!

Dan_Z
Honored Contributor

Another word from a Databricks employee:

"""

You can use the custom model approach, but configuring it is painful. Plus, you have to embed every loggable model in the custom model. Another, less intrusive solution would be to have a proxy server do the logging and then defer to the MLflow model server. See this very basic POC: https://github.com/amesar/mlflow-model-monitoring

Also check out Seldon Alibi for advanced monitoring.

"""
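
To make the proxy idea concrete, here is a rough sketch using only the Python standard library. It is not the code from the linked POC: make_proxy_handler, start_proxy and the "scoring_proxy" logger name are hypothetical, and the sketch assumes the model server accepts JSON POST requests at whatever upstream URL you pass in. Authentication headers, error handling for non-2xx upstream responses, and timeouts are deliberately left out.

```python
import logging
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical logger name; attach handlers for your real logging sink.
logger = logging.getLogger("scoring_proxy")


def make_proxy_handler(upstream_url: str):
    """Build a handler class that forwards POST bodies to upstream_url and logs both sides."""

    class ProxyHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            body = self.rfile.read(length)
            content_type = self.headers.get("Content-Type", "application/json")
            request = urllib.request.Request(
                upstream_url,
                data=body,
                headers={"Content-Type": content_type},
                method="POST",
            )
            # Assumption: the upstream answers with 2xx; HTTPError is not handled here.
            with urllib.request.urlopen(request) as upstream:
                status = upstream.status
                payload = upstream.read()
            # Log the request/response pair before replying to the caller.
            logger.info(
                "request=%s response=%s",
                body.decode("utf-8", "replace"),
                payload.decode("utf-8", "replace"),
            )
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

        def log_message(self, format, *args):
            pass  # Silence the default per-request stderr output.

    return ProxyHandler


def start_proxy(upstream_url: str, host: str = "127.0.0.1", port: int = 0) -> ThreadingHTTPServer:
    """Start the proxy in a daemon thread; port=0 lets the OS pick a free port."""
    server = ThreadingHTTPServer((host, port), make_proxy_handler(upstream_url))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Clients would then call the proxy's address instead of the model server, and the model itself stays untouched, which is the "less intrusive" property described in the quote.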

BeardyMan
New Contributor III

Thank you, Dan. We had originally suggested using Azure API Management or an Azure Function as an API wrapper that does the logging we want and then forwards the call to the MLflow model serving REST endpoint. I was just wondering if there was a better alternative or something obvious we were missing.
