Hi all, I've deployed a model to production and am serving it with MLflow, but when I test it from a Python notebook I get a 400 error. Code and details below:
import os
import requests
import json
import pandas as pd
import numpy as np
# Create two records for testing the prediction
test_input1 = {"OriginAirportCode":"SAT","Month":5,"DayofMonth":5,"CRSDepHour":13,"DayOfWeek":7,"Carrier":"MQ","DestAirportCode":"ORD","WindSpeed":9,"SeaLevelPressure":30.03,"HourlyPrecip":0}
test_input2 = {"OriginAirportCode":"ATL","Month":2,"DayofMonth":5,"CRSDepHour":8,"DayOfWeek":4,"Carrier":"MQ","DestAirportCode":"MCO","WindSpeed":3,"SeaLevelPressure":31.03,"HourlyPrecip":0}
# package the inputs into a JSON string and test run() in local notebook
inputs = pd.DataFrame([test_input1, test_input2])
print(inputs)
def create_tf_serving_json(data):
    return {'inputs': {name: data[name].tolist() for name in data.keys()} if isinstance(data, dict) else data.tolist()}

def score_model(dataset):
    url = 'https://adb-<obfuscated>.azuredatabricks.net/model/Delay%20Estimator/Production/invocations' # Enter your URL here
    personal_access_token = 'dapi2<obfuscated>853-2' # Enter your Personal Access Token here
    headers = {'Authorization': f'Bearer {personal_access_token}'}
    data_json = dataset.to_dict(orient='split') if isinstance(dataset, pd.DataFrame) else create_tf_serving_json(dataset)
    response = requests.request(method='POST', headers=headers, url=url, json=data_json)
    if response.status_code != 200:
        raise Exception(f'Request failed with status {response.status_code}, {response.text}')
    return response.json()
score_model(inputs)
Error output:
Exception: Request failed with status 400, {"error_code": "BAD_REQUEST", "message": "The input must be a JSON dictionary with exactly one of the input fields {'dataframe_split', 'instances', 'inputs', 'dataframe_records'}. Received dictionary with input fields: ['index', 'columns', 'data']. IMPORTANT: The MLflow Model scoring protocol has changed in MLflow version 2.0. If you are seeing this error, you are likely using an outdated scoring request format. To resolve the error, either update your request format or adjust your MLflow Model's requirements file to specify an older version of MLflow (for example, change the 'mlflow' requirement specifier to 'mlflow==1.30.0'). If you are making a request using the MLflow client (e.g. via `mlflow.pyfunc.spark_udf()`), upgrade your MLflow client to a version >= 2.0 in order to use the new request format. For more information about the updated MLflow Model scoring protocol in MLflow 2.0, see https://mlflow.org/docs/latest/models.html#deploy-mlflow-models."}
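Reading the error message, the server rejects the payload because `to_dict(orient='split')` produces a top-level dict with the keys `index`, `columns`, and `data`, while the MLflow 2.0 scoring protocol wants that dict nested under one of the listed fields (e.g. `dataframe_split`). A minimal sketch of what I think the fix looks like, built locally without hitting the endpoint (the helper name is mine):

```python
import pandas as pd

def create_mlflow2_payload(dataset: pd.DataFrame) -> dict:
    # MLflow 2.0 expects the split-oriented dict wrapped under 'dataframe_split'
    return {"dataframe_split": dataset.to_dict(orient="split")}

inputs = pd.DataFrame([{"OriginAirportCode": "SAT", "Month": 5}])
payload = create_mlflow2_payload(inputs)
print(list(payload.keys()))                    # ['dataframe_split']
print(payload["dataframe_split"]["columns"])   # ['OriginAirportCode', 'Month']
```

In `score_model` this would mean passing `payload` as the `json=` argument instead of the bare `to_dict` result; I haven't confirmed this against my endpoint yet, so treat it as a guess based on the error text.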
I've tested with earlier library versions and tried coding the call in different ways, with no luck. Does anyone know what the issue is, or a better way to call the endpoint?
I'll be calling the same serving endpoint from a web app once this is working.