2 weeks ago
Hi all,
I set up a serverless endpoint for my agent via a service account (S-account). This S-account is also granted access to another serverless endpoint. During inference, my endpoint uses the Databricks SDK to connect to the other endpoint without explicitly passing a bearer token:
```
```
I noticed that after more than 13 days without any updates to either endpoint, I got an access-denied error from the other endpoint. If I stop and then start my endpoint, it works again. Could someone explain the token lifetime to me, or how this can happen? Thank you very much.
2 weeks ago
Hi @pikachu89,
From what I can see, this does not look like an intended bearer token lifetime of 13 days. In Model Serving, the SDK is designed to resolve auth on each request, and in the serving runtime, the injected token is read from the mounted credentials file and only cached for 300 seconds before being reread. Given that restarting the endpoint restores functionality, this points more to stale runtime auth state than to an intended token expiry.
As a practical mitigation, I’d suggest creating the WorkspaceClient inside the inference path rather than at model load time, and if possible moving to the newer databricks-openai client since get_open_ai_client() is now deprecated.
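A minimal sketch of that mitigation, in case it helps. The `Agent` class, `predict` method, and `client_factory` parameter are hypothetical names for illustration, not part of any Databricks API; the only SDK calls assumed are `WorkspaceClient()` and `serving_endpoints.query(name=..., inputs=...)`.

```python
from typing import Any, Callable


def default_client_factory() -> Any:
    # Imported lazily so the module still loads where databricks-sdk is absent.
    from databricks.sdk import WorkspaceClient

    return WorkspaceClient()


class Agent:
    """Hypothetical model wrapper; adapt the names to your own serving code."""

    def __init__(self, client_factory: Callable[[], Any] = default_client_factory):
        # Keep only the factory. No client is built at model load time.
        self._client_factory = client_factory

    def predict(self, endpoint_name: str, payload: Any) -> Any:
        # Build a fresh client per request so auth is resolved at call time.
        client = self._client_factory()
        return client.serving_endpoints.query(name=endpoint_name, inputs=payload)
```

Passing the factory in as a parameter is only a design convenience here: it makes it easy to confirm that nothing is constructed at load time and to swap in a stub when testing.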
If this answer resolves your question, could you mark it as “Accept as Solution”? That helps other users quickly find the correct fix.
a week ago
As I described earlier, the client is created only at inference time, not at model loading. I still got the access-denied error.
a week ago
Hi @pikachu89,
Thanks for the clarification. If the client is already being created during inference, then the usual stale-client explanation does not apply. Even so, this still does not look like an expected 13-day token lifetime. In Model Serving, the SDK resolves auth per request, and the serving runtime rereads the injected token source at a short cadence rather than holding a single token for days.
Given that a stop/start immediately fixes it, this points more to a server-side token refresh or a runtime auth issue than to something obviously wrong in your code. The best path is to raise a support ticket and submit the exact timestamp of the failure, the HTTP status and response body, request IDs, and confirmation that the downstream endpoint permissions for the service account did not change.
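To make the ticket easier to put together, here is a rough sketch of capturing those details at the moment of failure. Everything in it is illustrative: `AuthFailureReport` and `build_report` are hypothetical helpers, and the `x-request-id` header name is an assumption, so check which header your responses actually carry.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Mapping, Optional


@dataclass
class AuthFailureReport:
    timestamp_utc: str
    endpoint_name: str
    http_status: int
    response_body: str
    request_id: Optional[str]


def build_report(
    endpoint_name: str,
    http_status: int,
    response_body: str,
    headers: Mapping[str, str],
    request_id_header: str = "x-request-id",  # assumed header name
) -> AuthFailureReport:
    # Header names are case-insensitive, so normalise before the lookup.
    lowered = {key.lower(): value for key, value in headers.items()}
    return AuthFailureReport(
        timestamp_utc=datetime.now(timezone.utc).isoformat(),
        endpoint_name=endpoint_name,
        http_status=http_status,
        response_body=response_body,
        request_id=lowered.get(request_id_header.lower()),
    )


def to_json(report: AuthFailureReport) -> str:
    # One JSON line per failure is easy to paste into a ticket.
    return json.dumps(asdict(report), sort_keys=True)
```

Logging `to_json(build_report(...))` from the exception handler around the downstream call should give support the exact timestamp, status, body, and request ID in one place.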
It is also worth moving off get_open_ai_client(), since that path is deprecated in favour of the newer Databricks OpenAI client.
a week ago
Thank you. Would you please point me to the usage/example of `databricks-openai` over `get_open_ai_client()`?
a week ago
Hi @pikachu89,
The equivalent pattern is `client = DatabricksOpenAI(workspace_client=WorkspaceClient())` and then `client.responses.create(model=..., input=...)`. The SDK deprecation note points to the Databricks OpenAI docs here: Databricks OpenAI Python docs.
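Spelled out as a small sketch, that pattern might look like the following. It assumes the `databricks-openai` package exposes `DatabricksOpenAI` from the `databricks_openai` module; `make_openai_client`, `ask`, and the `client_factory` parameter are hypothetical helper names, so please verify the import path against the docs.

```python
from typing import Any, Callable


def make_openai_client() -> Any:
    # Lazy imports: both packages are only needed when a client is built.
    # The databricks_openai import path is an assumption; check the docs.
    from databricks.sdk import WorkspaceClient
    from databricks_openai import DatabricksOpenAI

    return DatabricksOpenAI(workspace_client=WorkspaceClient())


def ask(
    endpoint_name: str,
    prompt: str,
    client_factory: Callable[[], Any] = make_openai_client,
) -> Any:
    # Create the client inside the request path, then call the Responses API
    # with the serving endpoint name as the model.
    client = client_factory()
    return client.responses.create(model=endpoint_name, input=prompt)
```

Calling `ask("<your-endpoint-name>", "Hello")` from inside the inference function keeps client creation in the request path, in line with the earlier suggestion.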
a week ago
Hello!
Yes, it seems so 😄
`WorkspaceClient()` is not anonymous: `get_open_ai_client()` uses the Databricks auth attached to that `WorkspaceClient` to query serving endpoints. The OpenAI client it returns comes preconfigured with that workspace client's Databricks auth.
So if your endpoint process is long-lived and the SDK/OpenAI client is created once and reused, it may be holding a stale credential.
For service principal (SP) or OAuth authentication, Databricks OAuth access tokens are valid for 1 hour, and unified authentication should request new tokens automatically when configured correctly. If you manually generate tokens, you must refresh them yourself.
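For the manual case only, a rough sketch of refreshing before expiry could look like this. `RefreshingToken` and the `fetch` callable are hypothetical: `fetch` stands in for however you obtain a token and is assumed to return the token plus its lifetime in seconds. With unified authentication through the SDK, none of this should be needed.

```python
import time
from typing import Callable, Optional, Tuple


class RefreshingToken:
    """Caches a token and fetches a new one shortly before it expires."""

    def __init__(
        self,
        fetch: Callable[[], Tuple[str, float]],  # hypothetical: (token, lifetime_seconds)
        skew_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ):
        self._fetch = fetch
        self._skew = skew_seconds
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        # Refresh when there is no token yet or it is within the skew window.
        if self._token is None or now >= self._expires_at - self._skew:
            token, lifetime_seconds = self._fetch()
            self._token = token
            self._expires_at = now + lifetime_seconds
        return self._token
```

With a 1-hour lifetime and the default 60-second skew, `get()` reuses the cached token for about 59 minutes and then fetches a new one.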