I deployed a simple Iris model in Databricks Model Serving and exposed it as an endpoint. I’m trying to query the endpoint using a service principal. I can successfully fetch the access token with the following databricks_token() function:
def databricks_token(): 
    token_url = f"https://accounts.cloud.databricks.com/oidc/accounts/{MY_ACCOUNT_ID}/v1/token"
    scope = "all-apis"
    data = {
        "grant_type": "client_credentials",
        "client_id": CLIENT_ID,
        "client_secret": CLIENT_SECRET,
        "scope": scope,
    }
    response = requests.post(token_url, data=data)
    token_data = response.json()
    access_token = token_data["access_token"]
    return access_token
Then I try to query the endpoint using score_model():
def score_model():
    url = f"https://{WORKSPACE_HOST}.cloud.databricks.com/serving-endpoints/{MODEL_SERVING_ENDPOINT_NAME}/invocations"
    headers = {'Authorization': f'Bearer {databricks_token()}', 'Content-Type': 'application/json'}
    data_json = json.dumps(data, allow_nan=True)
    response = requests.request(method='POST', headers=headers, url=url, data=data_json)
    if response.status_code != 200:
        raise Exception(f'Request failed with status {response.status_code}, {response.text}')
    return response.json()
print(score_model())
But the call fails with: Exception: Request failed with status 403, {"error_code":"403","message":"Unauthorized access to workspace: xxxxxxxxxx"}
In the Databricks UI, the serving endpoint already has the permission “All workspace users can query”.
What am I missing to allow a service principal to query the model serving endpoint? Do I need to assign additional workspace or service principal permissions beyond the endpoint-level access?
Note that the route optimization is not enabled here.