08-07-2025 06:47 PM
I was looking forward to using the OpenAI Responses API when the new gpt-oss models were added to Databricks.
The OpenAI website states these models are compatible with the Responses API.
However, Databricks does not seem to support it. See the error message below from trying it out with both gpt-oss endpoints.
Any idea why the API is not included?
Error code: 400 - {'error_code': 'BAD_REQUEST', 'message': 'BAD_REQUEST: Invalid endpoint type: responses is not supported by gpt-oss-20b.'}
08-22-2025 05:58 PM
Hi @frankc,
Could you please share more details about your implementation? I ran some tests in my environment using the playground and Python code and was successful:
```python
from openai import OpenAI
import os

# How to get your Databricks token: https://docs.databricks.com/en/dev-tools/auth/pat.html
DATABRICKS_TOKEN = os.environ.get('DATABRICKS_TOKEN')
# Alternatively, in a Databricks notebook you can use this:
# DATABRICKS_TOKEN = dbutils.notebook.entry_point.getDbutils().notebook().getContext().apiToken().get()

client = OpenAI(
    api_key=DATABRICKS_TOKEN,
    base_url="https://dbc-1a2e360a-8cca.cloud.databricks.com/serving-endpoints"
)

response = client.chat.completions.create(
    model="databricks-gpt-oss-20b",
    messages=[
        {"role": "user", "content": "Hello!"}
    ],
    max_tokens=5000
)

print(response.choices[0].message.content)
```
08-25-2025 06:32 PM - edited 08-25-2025 06:53 PM
Hi William,
I was asking about the OpenAI Responses API. Note that you are using the Chat Completions API.
Here is a link to what I am referring to: https://platform.openai.com/docs/guides/text?api-mode=responses
As I understand it, the only way to call the OpenAI Responses API today is to use an external model serving endpoint, which requires setting up an API key with OpenAI. I was looking to use the frontier models hosted by Databricks, where the data is not shared externally.
Hope this helps clarify. Let me know if you have any further questions. It would be great if the Responses API were available on Databricks.
Regards,
Frank
09-04-2025 08:11 PM
Hi Frank, I too have been waiting for the Responses API to become available, but with Azure. I'm somewhat surprised more people haven't been asking for it. In one of their first blogs about it in early August, Azure said they would be making the Responses API available for gpt-oss, but I am yet to see it.
If it's any consolation, I did find it is supported on Cloudflare. I have tested it with gpt-oss-120b along with their AutoRAG implementation and it was working well. However, it does have a lot of limitations, such as rate limiting and restrictions on file size for RAG, but it might suffice for your PoC.
Sunday
Greetings @frankc ,
The Responses API is not currently supported for gpt-oss models (or any models) on Databricks Foundation Model APIs, despite OpenAI stating that gpt-oss models are compatible with it. This is a platform limitation specific to Databricks' implementation, not a model limitation.
Databricks Foundation Model APIs are designed to be OpenAI-compatible, but they currently only support the Chat Completions API endpoint (`/v1/chat/completions`), not the newer Responses API endpoint (`/v1/responses`). When Databricks hosts models through their Foundation Model serving infrastructure, they implement specific API endpoints, and the Responses API simply hasn't been added to their supported endpoint types yet.
The Responses API is OpenAI's newer interface that offers:
- Native conversation state management: Automatically stores conversation history using `previous_response_id` instead of requiring you to manually pass message arrays
- Simplified input format: Uses `input` parameter instead of `messages` array
- First-class tool support: Better integration for agentic workflows
- Built-in storage: Responses are stored by default for easy retrieval
The Chat Completions API requires you to manually manage conversation state by passing the full message history with each request.
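To make the contrast concrete, here is a sketch of the two request shapes as plain dictionaries, with no network calls involved (the `resp_abc123` id is a placeholder, not a real response id):

```python
# Chat Completions: the caller resends the entire `messages` array
# on every turn; the server keeps no conversation state.
chat_request = {
    "model": "databricks-gpt-oss-20b",
    "messages": [
        {"role": "user", "content": "Hello!"},
        {"role": "assistant", "content": "Hi! How can I help?"},
        {"role": "user", "content": "Summarize our chat so far."},
    ],
    "max_tokens": 5000,
}

# Responses API: a single `input` string plus a pointer to the previous
# turn; the server reconstructs the conversation from stored state.
# ("resp_abc123" is a placeholder, not a real response id.)
responses_request = {
    "model": "databricks-gpt-oss-20b",
    "input": "Summarize our chat so far.",
    "previous_response_id": "resp_abc123",
}
```

The practical difference: the Chat Completions payload grows with every turn, while the Responses payload stays constant in size because the history lives server-side.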
Since the Responses API isn't available on Databricks, you'll need to continue using the Chat Completions API:
```python
from openai import OpenAI
import os
client = OpenAI(
api_key=os.environ.get('DATABRICKS_TOKEN'),
base_url="https://your-workspace.cloud.databricks.com/serving-endpoints"
)
response = client.chat.completions.create(
model="databricks-gpt-oss-20b",
messages=[
{"role": "user", "content": "Your prompt here"}
],
max_tokens=5000
)
```
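Because the Chat Completions endpoint is stateless, multi-turn use means appending each exchange to a list and resending it in full on every call. A minimal sketch of that bookkeeping, with a stub standing in for the real `client.chat.completions.create` call (`ask` and `fake_model` are hypothetical helper names, not part of the openai SDK):

```python
def ask(model_call, history, user_text):
    """Append the user turn, call the model with the FULL history,
    then append the assistant reply so the next call sees it too."""
    history.append({"role": "user", "content": user_text})
    reply = model_call(history)  # e.g. client.chat.completions.create(...)
    history.append({"role": "assistant", "content": reply})
    return reply

# Stub standing in for a real model call, to show the flow without a network:
fake_model = lambda msgs: f"(reply to {len(msgs)} messages)"

history = []
ask(fake_model, history, "Hello!")
ask(fake_model, history, "And a follow-up question.")

# After two turns the caller holds four messages and must resend all of them.
print(len(history))  # → 4
```

With the Responses API this loop disappears: you would pass `previous_response_id` instead of maintaining `history` yourself.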
There's no official timeline from Databricks for Responses API support. This appears to be a feature request that the community has raised. If you need the Responses API specifically, your options are:
- Use OpenAI directly: Create an external model serving endpoint pointing to OpenAI's API (though this means data leaves Databricks)
- Use alternative cloud providers: Cloudflare Workers AI reportedly supports the Responses API for gpt-oss models, though with limitations on rate limits and file sizes
- Wait for Databricks implementation: Continue monitoring Databricks announcements and community forums
You may want to upvote or add your use case to the community forum thread to help prioritize this feature with Databricks engineering.
Hope this helps, Louis.
Tuesday
That’s expected — Databricks’ current GPT-OSS integration only supports /completions and /chat/completions endpoints, not the /responses API yet. The OpenAI Responses API compatibility mentioned on OpenAI’s site applies to OpenAI-hosted models, not third-party deployments. You’ll need to use the chat completions format for now until Databricks updates their connector to include responses support.