Databricks

bhawik21 · 02-11-2022

I have used MLflow and got my model served through a REST API. It works fine when all model features are provided. But in my use case only a single feature (the primary key) will be provided by the consumer application, and my code has to look up the other features from a database based on that key and then use model.predict to return the prediction. I tried researching this, but understood that the REST endpoints simply invoke the model.predict function. How can I make it invoke a data massaging function before predicting?

-werners- · 02-11-2022

You want a machine learning pipeline, which is part of what is called MLOps.

You can find a lot about this online, like here, for example.

bhawik21 · 06-23-2022

Thanks @Werner Stinckens. While a pipeline can condense data prep into an abstraction, the use case in question was about invoking a data massaging function within the model's predict function.

Now that I have solved it, I'd like to share it with the community. I designed the solution by wrapping the model's predict function in a wrapper class's predict method. Since this is a non-standard model, I used the pyfunc flavor to log the model in MLflow. Because the conda environment is specified when logging, the model hosting cluster can install the required libraries and then run our custom model.

The main design aspect here is that we replace the model's native predict function with a user-defined one. It can do anything, such as reading and preparing data, before going on to call the model's native predict method.
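A minimal sketch of that wrapper idea, for anyone landing here later. This is not the exact code from my project: the `features` table, its `age` and `balance` columns, the `customer_id` key, and `StubModel` are all hypothetical stand-ins, and SQLite stands in for the real feature database. The import fallback is only there so the snippet also runs on a machine without MLflow installed.

```python
import os
import sqlite3
import tempfile

try:
    from mlflow.pyfunc import PythonModel
except ImportError:  # fallback so the sketch still runs without MLflow installed
    PythonModel = object


class StubModel:
    """Hypothetical stand-in for the trained model; any object with predict() works."""

    def predict(self, rows):
        # Toy rule: the score is the sum of the features in each row.
        return [sum(row) for row in rows]


class EnrichingModel(PythonModel):
    """Looks up the missing features by primary key, then calls the native predict."""

    def __init__(self, model, db_path):
        self.model = model
        self.db_path = db_path

    def _lookup(self, keys):
        # Open the connection per call so the wrapper object itself stays picklable.
        conn = sqlite3.connect(self.db_path)
        try:
            rows = []
            for key in keys:
                row = conn.execute(
                    "SELECT age, balance FROM features WHERE customer_id = ?", (key,)
                ).fetchone()
                if row is None:
                    raise KeyError(f"no features stored for customer_id={key}")
                rows.append(list(row))
            return rows
        finally:
            conn.close()

    def predict(self, context, model_input, params=None):
        # model_input carries only the key column (a pandas DataFrame when served;
        # a plain dict of lists is enough for this local demo).
        keys = [int(k) for k in model_input["customer_id"]]
        return self.model.predict(self._lookup(keys))


# Build a throwaway feature table for the demo (hypothetical schema and values).
db_path = os.path.join(tempfile.mkdtemp(), "features.db")
conn = sqlite3.connect(db_path)
conn.execute(
    "CREATE TABLE features (customer_id INTEGER PRIMARY KEY, age REAL, balance REAL)"
)
conn.executemany(
    "INSERT INTO features VALUES (?, ?, ?)", [(1, 30.0, 100.0), (2, 45.0, 250.5)]
)
conn.commit()
conn.close()

wrapped = EnrichingModel(StubModel(), db_path)
predictions = wrapped.predict(None, {"customer_id": [1, 2]})
print(predictions)  # [130.0, 295.5]
```

To serve something like this, the wrapper instance would be passed as `python_model` to `mlflow.pyfunc.log_model`, together with a `conda_env` listing whatever libraries the lookup needs.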

Databricks has a newish functionality called Feature Store that can handle this kind of use case. It has APIs that natively invoke a Feature Table lookup and run predict on the values fetched.

-werners- · 06-26-2022

nice!

LuisL · 01-24-2023

You can create a custom endpoint for your REST API that handles the data massaging before calling the model.predict function. This endpoint can take in the primary key as an input, retrieve the additional features from the database based on that key, and then pass the complete set of features to the model.predict function.

You can use a web framework like Flask or FastAPI to create the custom endpoint. For example, you can create a function that retrieves the additional features like lead enrichment from the database and calls the model.predict function, and then use this function as a route in your Flask or FastAPI app. The client application can then send a request to this custom endpoint with the primary key, and the endpoint will return the prediction based on the retrieved features.
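As a rough sketch of such an endpoint with Flask: the route path, the `features` table with its `age` and `balance` columns, and `StubModel` are all hypothetical, and an in-memory SQLite table stands in for the real database. In practice the model object would more likely come from `mlflow.pyfunc.load_model`.

```python
import sqlite3

from flask import Flask, jsonify

app = Flask(__name__)

# Hypothetical in-memory feature table; a real service would query its own database.
# check_same_thread=False lets Flask's worker threads reuse this one connection,
# which is acceptable for a read-only sketch.
db = sqlite3.connect(":memory:", check_same_thread=False)
db.execute(
    "CREATE TABLE features (customer_id INTEGER PRIMARY KEY, age REAL, balance REAL)"
)
db.execute("INSERT INTO features VALUES (1, 30.0, 100.0)")
db.commit()


class StubModel:
    """Hypothetical stand-in for a model loaded with mlflow.pyfunc.load_model."""

    def predict(self, rows):
        # Toy rule: the score is the sum of the features in each row.
        return [sum(row) for row in rows]


model = StubModel()


@app.route("/predict/<int:customer_id>")
def predict(customer_id):
    # Enrich: fetch the remaining features for the key supplied by the client.
    row = db.execute(
        "SELECT age, balance FROM features WHERE customer_id = ?", (customer_id,)
    ).fetchone()
    if row is None:
        return jsonify(error="unknown customer_id"), 404
    prediction = model.predict([list(row)])[0]
    return jsonify(customer_id=customer_id, prediction=prediction)


# Exercise the route in-process with Flask's test client (no server needed).
client = app.test_client()
response = client.get("/predict/1")
print(response.get_json())
```

The client only ever sends the key; the lookup and the call to predict both stay on the server side.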

You can also keep everything inside MLflow by wrapping your custom function in a custom mlflow.pyfunc model, so that the served predict runs it first.

You can also use Serverless Framework or other similar tools to deploy this function and expose it through an API Gateway.

How do I invoke a data enrichment function before model.predict while serving the model