Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Serverless budget policies auto-application and resource limiting in Databricks

neeraj_borana
New Contributor III

Hi Team,

I am exploring serverless compute in Databricks and have a few questions about governance and cost control.

We have multiple user groups in a workspace and are planning to move from all-purpose clusters to serverless compute. We understand that serverless budget policies can be used to tag and monitor serverless usage for cost attribution.

However, I would like to clarify:

  1. Is there any way to automatically apply a serverless budget policy based on user group membership, without requiring users to manually select the policy when creating notebooks, jobs, or pipelines?

  2. Apart from monitoring and tagging usage, is there any supported way to limit or cap serverless compute resources or cost per user or user group (for example, CPU, memory, concurrency, or spend limits)?

1 ACCEPTED SOLUTION

Accepted Solutions

Commitchell
Databricks Employee

Hi there,

That's great to hear that you're looking to use serverless. It's far less overhead and a better user experience than classic compute.

To answer your questions:

  1. Going forward, users should be required to select a budget policy on all new notebooks, jobs, and pipelines. The best way to avoid manual selection is to make sure each user has only one available policy; in that case it is auto-selected by default. The caveat is any existing content from before the policy was applied. Our documentation states, "Existing notebooks, jobs, and Lakeflow Spark Declarative Pipelines are not automatically assigned policies after their owners are granted access to a policy. To add a serverless budget policy to an existing asset, you must manually update the asset's serverless budget policy setting in the UI." Let us know if you're experiencing something different!
  2. Currently, serverless budget policies, which provide tracking and notifications, are the only capability we have in this area. I have heard that limits and throttling are an area we are working on; hopefully we'll see something in this space in the upcoming roadmap webinar next week.
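Once usage is tagged by a budget policy, cost attribution is mostly a reporting exercise over the billing system tables. As a rough illustration (not official guidance), the sketch below aggregates serverless spend per user from rows shaped like the documented `system.billing.usage` schema; the sample SKU filter, the `run_as` field path, and the flat DBU price are assumptions to verify against your own workspace:

```python
from collections import defaultdict

def serverless_spend_by_user(rows, price_per_dbu):
    """Sum estimated serverless cost per run_as identity.

    `rows` are dicts shaped like system.billing.usage records
    (sku_name, usage_quantity, identity_metadata.run_as).
    """
    totals = defaultdict(float)
    for r in rows:
        # Assumed convention: serverless SKU names contain "SERVERLESS".
        if "SERVERLESS" not in r["sku_name"]:
            continue
        user = (r.get("identity_metadata") or {}).get("run_as", "unknown")
        totals[user] += r["usage_quantity"] * price_per_dbu
    return dict(totals)
```

In practice you would run the equivalent aggregation as SQL directly against `system.billing.usage`, joining `system.billing.list_prices` for real rates instead of a flat `price_per_dbu`.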

 


3 REPLIES


Hi,

Thanks for the detailed response; that helps a lot.

I tried looking through the official documentation but couldn't find a specific reference that explicitly states that serverless compute resources (CPU, memory, concurrency, or spend) cannot be limited or capped today.

Could you please point me to any documentation or official reference that confirms this limitation, or should we treat this as a current product constraint that's not yet documented formally?

Thanks again for your help.

While you can limit or cap resource utilization with "classic" (self-hosted) compute by setting cluster policies, it still takes a focus on FinOps and end-user enablement to truly manage costs. The same is true of serverless budget policies.

The big draw of serverless compute is getting out of that infrastructure management. You shouldn't have to manage cores, memory, or concurrency; it should just work. In that sense, I don't view it as a limitation, and to answer your explicit question, I don't know of any reference that documents it as one. Keep in mind you still have controls, for example, to automatically terminate long-running jobs. Another big benefit is that serverless enables user-level attribution, whereas classic compute took some creative reporting to attribute specific user behavior to a cost.

That being said, many customers have expressed a similar desire for more control over their serverless costs, and I fully expect to see it on the roadmap sometime soon. Remember, even if serverless becomes your default, Databricks is committed to providing you choice: classic compute, with a bit more control but more overhead, and serverless compute, with fewer knobs and levers but some potentially serious performance and efficiency gains.