
Create Serving Endpoint with JAVA Runtime

prafull
New Contributor

Hello,

I'm trying to create a custom serving endpoint. I use the artifacts argument while logging the run/model to save .jar files; these files are called when .predict is invoked.

Java runtime 8 or higher is required to run the jar file, and I'm not sure how to create a serving endpoint that will have a Java runtime.

import mlflow
import pandas as pd
# ScoringCodeModel runs DataRobot Scoring Code jars (datarobot-predict package)
from datarobot_predict.scoring_code import ScoringCodeModel

# Model wrapper class
# This model returns the probability of all classes, the prediction,
# and the prediction probability
class ModelWrapper_custom_DataRobot_To_Linesense(mlflow.pyfunc.PythonModel):
    # Initialize model in the constructor
    def __init__(self, model):
        self.model = model

    # Prediction function
    def predict(self, context, model_input):
        # Load the Scoring Code jar that was logged as a run artifact
        model = ScoringCodeModel(context.artifacts['model_jar'])

        model_output = model.predict(model_input)

        df_temp = pd.DataFrame()

        # Extract the probability value of the selected prediction
        df_temp['prediction_probability'] = model_output.max(axis=1)

        # Find the column with the maximum probability for each row
        df_temp['prediction'] = model_output.idxmax(axis=1)

        # Remove NaN rows; this is required to avoid errors in the max lookup
        df_temp = df_temp.dropna()

        # Extract the class label from the 'prediction' column name
        df_temp['prediction'] = df_temp['prediction'].apply(lambda x: x.split('_')[1])

        # Return the result as JSON records
        return df_temp.to_json(orient='records')

This is a simplified version of the model wrapper. When the serving endpoint is deployed, it cannot run inference because the Java runtime is missing.

 

1 REPLY

Kaniz
Community Manager

Hi @prafull,

Creating a custom serving endpoint with the necessary JAVA runtime is essential for deploying your model.

Let’s explore some steps to achieve this:

  1. Databricks Model Serving:

    • Databricks provides options for creating and managing model-serving endpoints. You can use the Serving UI, REST API, or the MLflow Deployments SDK.
    • The Serving UI allows you to create endpoints interactively, while the REST API enables programmatic creation.
    • Ensure that your workspace is in a supported region before proceeding.
  2. Access Control:

    • By default, the creator of a serving endpoint can manage it; use the endpoint's permissions (Can View, Can Query, Can Manage) to grant access to other users.
  3. Create an Endpoint using the UI:

    • In the Databricks UI, navigate to Serving in the sidebar.
    • Click Create serving endpoint.
    • For models registered in the Workspace model registry or Unity Catalog:
      • Provide a name for your endpoint.
      • Select the type of model you want to serve (e.g., scikit-learn, XGBoost, PyTorch).
      • Choose the model and version to serve.
      • Specify the percentage of traffic to route to your model.
      • Select the compute type (CPU or GPU).
      • Set the compute scale-out based on the number of requests your model can handle simultaneously.
  4. Custom Model Wrapper:

    • Your simplified model wrapper class is a good starting point.
    • Ensure that the context.artifacts['model_jar'] points to the correct .jar file containing your model.
    • The ScoringCodeModel class should be defined to handle the model inference using the .jar file.
    • Extract predictions and probabilities as you’ve done in your code.
  5. Java Runtime:

    • To ensure the endpoint has the required Java runtime, consider the following:
      • Packaging: Package your model and dependencies (including the .jar file) appropriately.
      • Environment: Deploy the endpoint in an environment with Java 8 or higher.
      • Testing: Test the endpoint to verify that it can successfully run the .jar file.
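One way to attempt the Environment point above is to declare a Java runtime in the conda environment logged with the model, so the serving container builds it alongside the Python dependencies. This is a sketch under assumptions: the environment name and version pins are placeholders, and you should verify that your serving infrastructure honors conda-level packages such as `openjdk` from conda-forge:

```python
# Conda environment bundling a Java runtime with the model's dependencies.
conda_env = {
    "name": "java_serving_env",   # hypothetical name
    "channels": ["conda-forge"],
    "dependencies": [
        "python=3.10",
        "openjdk=11",             # satisfies the Java 8+ requirement of the jar
        {"pip": ["mlflow", "pandas", "datarobot-predict"]},
    ],
}

# Pass it when logging the wrapper, e.g.:
# mlflow.pyfunc.log_model(..., conda_env=conda_env)
```

If the container image does build from this environment, `java` should be on the `PATH` when `ScoringCodeModel` launches the jar.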

Remember that this is a high-level overview, and you may need to adapt it to your specific use case. Good luck with deploying your custom serving endpoint! 🚀