Generative AI
Explore discussions on generative artificial intelligence techniques and applications within the Databricks Community. Share ideas, challenges, and breakthroughs in this cutting-edge field.

PERMISSION_DENIED: The endpoint is temporarily disabled due to a Databricks-set rate limit of 0

itssb
New Contributor

I'm new to Databricks and have been trying to test some AI models through the AI Gateway. Every model I try returns this error:

{"error_code":"PERMISSION_DENIED","message":"PERMISSION_DENIED: The endpoint is temporarily disabled due to a Databricks-set rate limit of 0."}

I also tried setting the rate limits myself, as shown in the picture, but that didn't work either. I have entered credit card details, so this shouldn't be a free trial account limit.

How can I fix this?


Louis_Frolio
Databricks Employee

Greetings @itssb, I did some digging and here is what I found:

What you are seeing is a Databricks-imposed rate limit of 0, and that setting takes precedence over the endpoint- or user-level rate limits you configured in the UI. In other words, even if you set non-zero QPM (queries per minute) or TPM (tokens per minute) values in Serving or AI Gateway, those settings will not override this restriction.

This is expected behavior for certain high-demand hosted models, including GPT-5.x and some Claude variants, when used from trial or Free Edition workspaces. In those cases, the workspace is often placed in a TRIAL_VERIFIED trust tier, which can block or heavily restrict access to premium models regardless of the limits shown in the UI.

The key point is this: the “rate limit of 0” error is not something that can be fixed by adjusting endpoint settings. It reflects a workspace-level access restriction for that model.
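If you want client code to tell this specific block apart from other PERMISSION_DENIED responses (for example, a missing query permission), here is a minimal sketch. It assumes the error body is JSON with the `error_code` and `message` fields shown in the question; the function name and the marker constant are my own for illustration, not part of any Databricks API.

```python
import json

# Substring taken from the error message quoted in the question.
ZERO_LIMIT_MARKER = "Databricks-set rate limit of 0"


def is_platform_zero_rate_limit(body: str) -> bool:
    """Return True if an error body looks like the Databricks-set rate limit of 0.

    Hypothetical helper: it only inspects the two fields seen in the question.
    """
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # Not JSON at all, so it cannot be the error we are looking for.
        return False
    if not isinstance(payload, dict):
        return False
    return (
        payload.get("error_code") == "PERMISSION_DENIED"
        and ZERO_LIMIT_MARKER in str(payload.get("message", ""))
    )
```

When this returns True, changing the limits in the UI will not help; the fix is at the workspace level.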

The path forward is one of the following:

  • Upgrade the workspace to a paid / fully enabled subscription

  • Work with Databricks Sales or Support to have the workspace converted so it is fully enabled for those premium hosted models

Once the workspace is moved to a PAYABLE_VERIFIED tier, this Databricks-set rate limit of 0 typically disappears, and the same endpoint will often begin working without any additional UI changes.

In the meantime, the practical workaround is to use open-source or otherwise non-gated models, such as Llama, which are not subject to this specific Databricks-imposed 0-rate-limit restriction.
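As a rough sketch of that workaround, the snippet below tries a list of endpoints in order and skips any that fail with the rate-limit-of-0 message. Everything here is an assumption for illustration: the endpoint names are placeholders, and `query` stands in for whatever client call you already use. It is not a Databricks SDK function.

```python
from typing import Callable, Iterable, List, Tuple

# Substring taken from the error message quoted in the question.
ZERO_LIMIT_MARKER = "Databricks-set rate limit of 0"


def first_usable_endpoint(
    endpoints: Iterable[str],
    query: Callable[[str], str],
) -> Tuple[str, str]:
    """Return (endpoint_name, response) for the first endpoint that is not blocked.

    `query` is a caller-supplied function (hypothetical) that sends a request
    to the named endpoint and raises an exception on failure.
    """
    blocked: List[str] = []
    for name in endpoints:
        try:
            return name, query(name)
        except Exception as exc:
            # Only swallow the platform-level block; re-raise anything else.
            if ZERO_LIMIT_MARKER not in str(exc):
                raise
            blocked.append(name)
    raise RuntimeError(f"All endpoints are blocked by a rate limit of 0: {blocked}")


def fake_query(name: str) -> str:
    """Stub used only to demonstrate the control flow; replace with a real call."""
    if name == "gated-model-endpoint":
        raise PermissionError(
            "PERMISSION_DENIED: The endpoint is temporarily disabled "
            "due to a Databricks-set rate limit of 0."
        )
    return "ok"


chosen, reply = first_usable_endpoint(
    ["gated-model-endpoint", "open-model-endpoint"], fake_query
)
```

With the stub above, the gated placeholder is skipped and the open placeholder is used. Other errors are deliberately re-raised so that real permission or network problems are not hidden.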

Hope this helps, Louis.