
AI_QUERY fails with o1-mini

jAAmes_bentley
Contributor

As of sometime between March 7th and March 12th, the AI_QUERY function has become very temperamental with Azure OpenAI models.

Asking a basic question of our Mosaic AI o1-mini serving endpoint using AI_QUERY causes an error sometimes but not every time:

spark.sql("""
SELECT AI_QUERY(
        'o1-mini', 
        request => 'What is the capital of United Kingdom?',
        failOnError => false,
        modelParameters => named_struct('temperature', 1)
    )
 """).display()

Sometimes we get this:

"[REMOTE_FUNCTION_HTTP_FAILED_ERROR] The remote HTTP request failed with code 400, and error message 'HTTP request failed with status: {\"error_code\":\"BAD_REQUEST\",\"message\":\"{\\\\\"external_model_provider\\\\\":\\\\\"openai\\\\\",\\\\\"external_model_error\\\\\":{\\\\\"error\\\\\":{\\\\\"message\\\\\":\\\\\"Invalid prompt: your prompt was flagged as potentially violating our usage policy. Please try again with a different prompt: https://platform.openai.com/docs/guides/reasoning#advice-on-prompting\\\\\",\\\\\"type\\\\\":\\\\\"invalid_request_error\\\\\",\\\\\"param\\\\\":null,\\\\\"code\\\\\":\\\\\"invalid_prompt\\\\\"}}}\"}' SQLSTATE: 57012"

And other times the expected output:

result: "The capital of the United Kingdom is **London**."
errorMessage: null

Sometimes the error instead suggests the resource is not available:

[screenshot: jAAmes_bentley_0-1741797431261.png]

We have also noticed a similar issue arise over the same few days with embeddings, though the error is different.

Running an OpenAI embedding model such as:

spark.sql("""
SELECT AI_QUERY(
        'text-embedding-ada-002',
        request => 'What is the capital of United Kingdom?',
        failOnError => false
    )
 """).display()

Now always yields:

[REMOTE_FUNCTION_HTTP_RESULT_UNEXPECTED_ERROR] Failed to evaluate the ai_query SQL function due to inability to process the unexpected remote HTTP response; the error message is 'Missing valid errors field in remote response.'. Check API documentation: https://docs.databricks.com/en/generative-ai/generative-ai.html. Please fix the problem indicated in the error message and retry the query again. SQLSTATE: 57012


It is worth noting that both of these models work fine through the machine learning playground.

Has anyone else experienced this in the past few days? Might it be related to the March 2025 release of Azure Databricks?
Thanks everyone!


1 REPLY

Louis_Frolio
Databricks Employee

Hello @jAAmes_bentley , I did some digging and here is what I found.

Root Cause: Unsupported Temperature Parameter

The primary issue with your `AI_QUERY` call to o1-mini is the temperature parameter. OpenAI's o1-series models (o1, o1-mini, o1-preview, o3-mini) do not support the `temperature` parameter at all, even when set to the default value of 1. When you include `modelParameters => named_struct('temperature', 1)`, it causes the OpenAI API to reject the request, resulting in the misleading "Invalid prompt" error.

The o1 reasoning models also don't support other sampling parameters like `top_p`, `presence_penalty`, `frequency_penalty`, and several others that work with GPT-4o and earlier models.

Solution for o1-mini

Remove the `modelParameters` argument entirely when calling o1-mini:

```python
spark.sql("""
SELECT AI_QUERY(
        'o1-mini',
        request => 'What is the capital of United Kingdom?',
        failOnError => false
    )
""").display()
```

This should resolve the intermittent 400 errors you're experiencing.

Embedding Model Issue

The `text-embedding-ada-002` error is separate and indicates a response parsing issue rather than a parameter problem. The error `Missing valid errors field in remote response` suggests that Databricks `AI_QUERY` is receiving an unexpected response format from the embedding endpoint.

For embeddings through `AI_QUERY`, ensure your endpoint configuration matches the expected response schema. Since the model works in the ML playground, this may be a temporary API integration issue between Databricks and Azure OpenAI that emerged in the March 2025 release timeframe.
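If the parsing failure persists, one workaround worth trying is declaring the return type explicitly, since embedding endpoints return an array of floats and an explicit type spares `AI_QUERY` from inferring the response schema. A minimal sketch, assuming your runtime accepts the named `returnType` argument and that pinning the type actually sidesteps the parsing issue:

```python
# Sketch: declare the embedding return type explicitly so AI_QUERY
# does not infer the response schema from the endpoint.
# Assumption: an explicit returnType works around the parsing failure.
spark.sql("""
SELECT AI_QUERY(
        'text-embedding-ada-002',
        request => 'What is the capital of United Kingdom?',
        returnType => 'ARRAY<FLOAT>'
    )
""").display()
```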

Why It Works in ML Playground

The ML playground likely handles model-specific parameter restrictions automatically, stripping unsupported parameters before sending requests to OpenAI. The `AI_QUERY` function passes parameters directly, requiring you to manually ensure compatibility with each model's constraints.
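If you need one codepath that serves both model families, you can mirror that stripping yourself before building the SQL. A minimal sketch, assuming a Databricks notebook where `spark` is available; the helper name and the model list are hypothetical, not a Databricks API:

```python
# Hypothetical helper (not a Databricks API): build an AI_QUERY statement,
# omitting sampling parameters for reasoning models that reject them.
REASONING_MODELS = {"o1", "o1-mini", "o1-preview", "o3-mini"}

def build_ai_query(endpoint: str, prompt: str, temperature: float = 1.0) -> str:
    # o1-series models reject 'temperature' and other sampling parameters,
    # so include modelParameters only for non-reasoning models.
    params = ""
    if endpoint not in REASONING_MODELS:
        params = f",\n        modelParameters => named_struct('temperature', {temperature})"
    return f"""
SELECT AI_QUERY(
        '{endpoint}',
        request => '{prompt}',
        failOnError => false{params}
    )
"""

spark.sql(build_ai_query("o1-mini", "What is the capital of United Kingdom?")).display()
```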

Hope this helps, Louis.