Hi @JoaoPigozzo
To enable word_timestamps=True when querying your Whisper model deployed as a serving endpoint in Databricks,
you must modify the serving endpoint's inference logic to accept and process this parameter.
1. Update your model serving function to accept word_timestamps
Inside the model serving code you deployed (likely a Python function or MLflow model),
update the handler to read from the input payload:
def predict(model_input):
    import base64
    import tempfile
    import whisper

    # whisper.load_audio expects a file path, so write the decoded bytes
    # to a temporary file before loading.
    audio_data = base64.b64decode(model_input["audio"])
    with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as f:
        f.write(audio_data)
        audio_path = f.name

    audio = whisper.load_audio(audio_path)
    word_ts = model_input.get("word_timestamps", False)

    model = whisper.load_model("small")
    result = model.transcribe(audio=audio, word_timestamps=word_ts)
    return result
Make sure the model is packaged as a custom Python function model or an MLflow pyfunc model,
with this logic inside its predict() method.
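For reference, here is a minimal sketch of how that logic could be wrapped in an mlflow.pyfunc.PythonModel. The class name, the artifact path in the log_model call, and the assumption that the payload arrives as a plain dict (rather than a pandas DataFrame) are illustrative only and should be adapted to your deployment:

import mlflow.pyfunc

class WhisperTranscriber(mlflow.pyfunc.PythonModel):  # hypothetical class name
    def load_context(self, context):
        import whisper
        # Load the Whisper model once at endpoint startup instead of per request.
        self.model = whisper.load_model("small")

    def predict(self, context, model_input):
        import base64
        import tempfile
        import whisper
        # Assumes the payload arrives as a dict, as in the handler above;
        # if MLflow hands you a pandas DataFrame, read the fields from the first row instead.
        audio_data = base64.b64decode(model_input["audio"])
        with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as f:
            f.write(audio_data)
            audio_path = f.name
        audio = whisper.load_audio(audio_path)
        word_ts = model_input.get("word_timestamps", False)
        return self.model.transcribe(audio=audio, word_timestamps=word_ts)

# Hypothetical logging call; adjust the artifact path and any registry settings to your setup.
mlflow.pyfunc.log_model("whisper_model", python_model=WhisperTranscriber())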
2. Pass the parameter in your Databricks query:
response = workspace_client.serving_endpoints.query(
    name="whisperv3",
    inputs={
        "audio": base64_audio_chunks[first_audio_key],
        "word_timestamps": True
    }
)
print(response.predictions[0])
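Once the parameter reaches Whisper, each segment of the transcription includes a words list with per-word start and end times. A short sketch of reading them out, assuming the endpoint returns the raw transcribe() dict unchanged:

result = response.predictions[0]
# With word_timestamps=True, Whisper attaches a "words" list to each segment;
# each entry has "word", "start", and "end" fields.
for segment in result["segments"]:
    for word in segment.get("words", []):
        print(word["word"], word["start"], word["end"])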
Notes:
If you deploy with MLflow, this logic must live in the predict() method of your PythonModel wrapper.
If you serve through a custom handler behind Databricks Model Serving's REST API,
the handler must read the incoming JSON and pass word_timestamps=True through to the Whisper call.
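For completeness, a sketch of calling the endpoint's REST API directly with requests. DATABRICKS_HOST and DATABRICKS_TOKEN are placeholders for your workspace URL and access token, and the payload reuses the base64_audio_chunks variable from your example:

import requests

# Placeholders: fill in your workspace URL and personal access token.
DATABRICKS_HOST = "https://<your-workspace>.cloud.databricks.com"
DATABRICKS_TOKEN = "<personal-access-token>"

url = f"{DATABRICKS_HOST}/serving-endpoints/whisperv3/invocations"
headers = {"Authorization": f"Bearer {DATABRICKS_TOKEN}"}
payload = {
    "inputs": {
        "audio": base64_audio_chunks[first_audio_key],
        "word_timestamps": True,
    }
}

response = requests.post(url, headers=headers, json=payload)
print(response.json())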
LR