08-07-2025 06:47 PM
I was looking forward to using the OpenAI Responses API when the new gpt-oss models were added to Databricks.
The OpenAI website states these models are compatible with the Responses API.
However, Databricks does not seem to support it. See the error message below from trying it out with both gpt-oss endpoints.
Any idea why the API is not included?
Error code: 400 - {'error_code': 'BAD_REQUEST', 'message': 'BAD_REQUEST: Invalid endpoint type: responses is not supported by gpt-oss-20b.'}
08-22-2025 05:58 PM
Hi @frankc,
Could you please share more details about your implementation? I ran some tests in my environment using the playground and Python code and was successful:
```python
from openai import OpenAI
import os

# How to get your Databricks token: https://docs.databricks.com/en/dev-tools/auth/pat.html
DATABRICKS_TOKEN = os.environ.get('DATABRICKS_TOKEN')
# Alternatively, in a Databricks notebook you can use this:
# DATABRICKS_TOKEN = dbutils.notebook.entry_point.getDbutils().notebook().getContext().apiToken().get()

client = OpenAI(
    api_key=DATABRICKS_TOKEN,
    base_url="https://dbc-1a2e360a-8cca.cloud.databricks.com/serving-endpoints"
)

response = client.chat.completions.create(
    model="databricks-gpt-oss-20b",
    messages=[
        {"role": "user", "content": "Hello!"}
    ],
    max_tokens=5000
)

print(response.choices[0].message.content)
```
08-25-2025 06:32 PM - edited 08-25-2025 06:53 PM
Hi William,
I was asking about the OpenAI Responses API. Note that you are using the Chat Completions API.
Here is a link to what I am referring to: https://platform.openai.com/docs/guides/text?api-mode=responses
As I understand it, the only way to call the OpenAI Responses API today is to use an external model serving endpoint, which requires setting up an API key with OpenAI. I was looking to use the frontier models hosted by Databricks, where the data is not shared externally.
Hope this helps clarify. Let me know if you have any further questions. It would be great if the Responses API were available on Databricks.
Regards,
Frank
09-04-2025 08:11 PM
Hi Frank, I too have been waiting for the Responses API to become available, but with Azure. I'm somewhat surprised more people haven't been asking for it. In one of their first blogs about it in early August, Azure said they would be making the Responses API available for gpt-oss, but I am yet to see it.
If it's any consolation, I did find it is supported on Cloudflare. I have tested it with gpt-oss-120b along with their AutoRAG implementation and it was working well. However, it does have a lot of limitations, such as rate limiting and restrictions on file size for RAG, but it might suffice for your PoC.
Sunday
Greetings @frankc ,
The Responses API is not currently supported for gpt-oss models (or any models) on Databricks Foundation Model APIs, despite OpenAI stating that gpt-oss models are compatible with it. This is a platform limitation specific to Databricks' implementation, not a model limitation.
Databricks Foundation Model APIs are designed to be OpenAI-compatible, but they currently only support the Chat Completions API endpoint (`/v1/chat/completions`), not the newer Responses API endpoint (`/v1/responses`). When Databricks hosts models through their Foundation Model serving infrastructure, they implement specific API endpoints, and the Responses API simply hasn't been added to their supported endpoint types yet.
The Responses API is OpenAI's newer interface that offers:
- Native conversation state management: Automatically stores conversation history using `previous_response_id` instead of requiring you to manually pass message arrays
- Simplified input format: Uses `input` parameter instead of `messages` array
- First-class tool support: Better integration for agentic workflows
- Built-in storage: Responses are stored by default for easy retrieval
The Chat Completions API requires you to manually manage conversation state by passing the full message history with each request.
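To make the contrast concrete, here is a sketch of the two request shapes as plain dictionaries, with no network calls involved (the `resp_abc123` id is a placeholder, not a real response id):

```python
# Chat Completions: the caller resends the entire `messages` array
# on every turn; the server keeps no conversation state.
chat_request = {
    "model": "databricks-gpt-oss-20b",
    "messages": [
        {"role": "user", "content": "Hello!"},
        {"role": "assistant", "content": "Hi! How can I help?"},
        {"role": "user", "content": "Summarize our chat so far."},
    ],
    "max_tokens": 5000,
}

# Responses API: a single `input` string plus a pointer to the previous
# turn; the server reconstructs the conversation from stored state.
# ("resp_abc123" is a placeholder, not a real response id.)
responses_request = {
    "model": "databricks-gpt-oss-20b",
    "input": "Summarize our chat so far.",
    "previous_response_id": "resp_abc123",
}
```

The practical difference: the Chat Completions payload grows with every turn, while the Responses payload stays constant in size because the history lives server-side.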
Since the Responses API isn't available on Databricks, you'll need to continue using the Chat Completions API:
```python
from openai import OpenAI
import os
client = OpenAI(
api_key=os.environ.get('DATABRICKS_TOKEN'),
base_url="https://your-workspace.cloud.databricks.com/serving-endpoints"
)
response = client.chat.completions.create(
model="databricks-gpt-oss-20b",
messages=[
{"role": "user", "content": "Your prompt here"}
],
max_tokens=5000
)
```
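Because the Chat Completions endpoint is stateless, multi-turn use means appending each exchange to a list and resending it in full on every call. A minimal sketch of that bookkeeping, with a stub standing in for the real `client.chat.completions.create` call (`ask` and `fake_model` are hypothetical helper names, not part of the openai SDK):

```python
def ask(model_call, history, user_text):
    """Append the user turn, call the model with the FULL history,
    then append the assistant reply so the next call sees it too."""
    history.append({"role": "user", "content": user_text})
    reply = model_call(history)  # e.g. client.chat.completions.create(...)
    history.append({"role": "assistant", "content": reply})
    return reply

# Stub standing in for a real model call, to show the flow without a network:
fake_model = lambda msgs: f"(reply to {len(msgs)} messages)"

history = []
ask(fake_model, history, "Hello!")
ask(fake_model, history, "And a follow-up question.")

# After two turns the caller holds four messages and must resend all of them.
print(len(history))  # → 4
```

With the Responses API this loop disappears: you would pass `previous_response_id` instead of maintaining `history` yourself.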
There's no official timeline from Databricks for Responses API support. This appears to be a feature request that the community has raised. If you need the Responses API specifically, your options are:
- Use OpenAI directly: Create an external model serving endpoint pointing to OpenAI's API (though this means data leaves Databricks)
- Use alternative cloud providers: Cloudflare Workers AI reportedly supports the Responses API for gpt-oss models, though with limitations on rate limits and file sizes
- Wait for Databricks implementation: Continue monitoring Databricks announcements and community forums
You may want to upvote or add your use case to the community forum thread to help prioritize this feature with Databricks engineering.
Hope this helps, Louis.
Tuesday
That’s expected — Databricks’ current GPT-OSS integration only supports /completions and /chat/completions endpoints, not the /responses API yet. The OpenAI Responses API compatibility mentioned on OpenAI’s site applies to OpenAI-hosted models, not third-party deployments. You’ll need to use the chat completions format for now until Databricks updates their connector to include responses support.