
Sporadic (and cost-efficient) Model Serving on Databricks?

cbossi
New Contributor II

Hi all,

I'm new to Databricks so would appreciate some advice.

I have an ML model deployed using Databricks Model Serving. My use case is very sporadic: I only need to make 5–15 prediction requests per day (industrial application), and there can be long idle periods between requests. I've noticed that after a cold start, the serving cluster stays up for at least 30 minutes (the minimum idle timeout), and I am billed for this entire period, even if no further requests are made.

Is there any way to serve models on Databricks where I only pay for actual requests (compute time), and not for idle time? Or are there recommended alternatives, perhaps via integration with other Azure services?

Thanks for any advice!

1 ACCEPTED SOLUTION

KaushalVachhani
Databricks Employee

Hi @cbossi, you are right!

The endpoint scales down only after a 30-minute idle period, and you are billed for the compute resources used during that idle window in addition to the actual serving time when requests are made. This is the current expected behaviour; the idle timeout cannot currently be reduced below 30 minutes.
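
For reference, the 30-minute window applies even when scale-to-zero is enabled on the endpoint. Here is a minimal sketch of that configuration using the Databricks Python SDK (databricks-sdk); the endpoint and model names are hypothetical:

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.serving import (
    EndpointCoreConfigInput,
    ServedEntityInput,
)

w = WorkspaceClient()  # authenticates from the environment / notebook context

w.serving_endpoints.create(
    name="sporadic-model-endpoint",  # hypothetical endpoint name
    config=EndpointCoreConfigInput(
        served_entities=[
            ServedEntityInput(
                entity_name="my_industrial_model",  # hypothetical registered model
                entity_version="1",
                workload_size="Small",
                # Scale-to-zero stops billing while the endpoint is down, but the
                # endpoint only scales down after the fixed 30-minute idle period.
                scale_to_zero_enabled=True,
            )
        ]
    ),
)
```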

If your use case does not require real-time predictions, batch inference is a better fit: accumulate the requests throughout the day and score them all at once with a scheduled job (see the sketch below). Alternatively, you can explore Azure Functions to host the model, which on a consumption plan bills only for actual executions.
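
A minimal sketch of the batch pattern, assuming incoming requests are appended to a Delta table during the day and a scheduled Databricks job scores them in one pass (the table names and model URI below are hypothetical):

```python
import mlflow

# Load the registered model as a Spark UDF so scoring runs on the job cluster.
# `spark` is predefined in Databricks notebooks and jobs.
predict_udf = mlflow.pyfunc.spark_udf(
    spark, model_uri="models:/my_industrial_model/1"  # hypothetical model URI
)

# Read the requests accumulated since the last run (hypothetical input table).
pending = spark.table("factory.requests_pending")

# Score every pending request in one pass.
feature_cols = [c for c in pending.columns if c != "request_id"]
scored = pending.withColumn("prediction", predict_udf(*feature_cols))

# Persist the results (hypothetical output table).
scored.write.mode("append").saveAsTable("factory.predictions")
```

Scheduled once a day as a Databricks job, the cluster only runs for the few minutes the scoring takes, so you pay for actual compute rather than idle time.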

