on ‎01-10-2024 05:00 PM
To identify the root cause of worker termination, analyze the signals surrounding the failure. These problems are typically associated with memory pressure, but you need the specific events, the workload type, and the workload size to pin down the underlying problem.
A "Workspace quota exhausted" error message indicates that provisioned concurrency in the workspace has reached the default limit of 200. Each endpoint counts toward this limit by the highest number of concurrent requests that can be allocated to it. For example, if an endpoint serves a model with a large workload size, which supports 16-64 concurrent requests, the maximum provisioned concurrency for that endpoint is 64. The default limit of 200 applies to the cumulative total across all endpoints.
If you need to extend this default limit, contact Databricks support for further assistance.
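To see how the cumulative limit is consumed, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the endpoint names and the helper functions are hypothetical, the per-endpoint values are assumed maximums that you would read from your own endpoint configuration, and nothing here calls a Databricks API.

```python
# Default workspace-wide limit described above; a raised limit would change this value.
WORKSPACE_CONCURRENCY_LIMIT = 200


def total_provisioned_concurrency(endpoint_max_concurrency: dict[str, int]) -> int:
    """Sum the maximum provisioned concurrency of every endpoint."""
    return sum(endpoint_max_concurrency.values())


def remaining_quota(
    endpoint_max_concurrency: dict[str, int],
    limit: int = WORKSPACE_CONCURRENCY_LIMIT,
) -> int:
    """Concurrency still available before the workspace limit is reached."""
    return limit - total_provisioned_concurrency(endpoint_max_concurrency)


# Hypothetical endpoints, each serving a large workload (maximum of 64).
endpoints = {
    "endpoint-a": 64,
    "endpoint-b": 64,
    "endpoint-c": 64,
}

used = total_provisioned_concurrency(endpoints)
headroom = remaining_quota(endpoints)

print(f"Provisioned concurrency in use: {used}")
print(f"Remaining before the limit:     {headroom}")
print(f"Room for another large endpoint (64)? {headroom >= 64}")
```

With three large endpoints, 192 of the 200 units are already committed, so a fourth large endpoint would exceed the default limit and could produce the quota error.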