a month ago
I’ve deployed OpenAI’s Whisper model as a serving endpoint in Databricks and I’m trying to transcribe an audio file.
import whisper
model = whisper.load_model("small")
transcript = model.transcribe(
    word_timestamps=True,
    audio="path/to/audio"
)
In Databricks, I query the model like this:
response = workspace_client.serving_endpoints.query(
    name="whisperv3",
    inputs=[base64_audio_chunks[first_audio_key]]
)
print(response.predictions[0])
My question is: How can I enable word_timestamps=True when querying the model in Databricks?
Is there a way to pass this option in the inputs or another parameter?
a month ago
Hi @JoaoPigozzo
To enable word_timestamps=True when querying your Whisper model deployed as a serving endpoint in Databricks,
you must modify the serving endpoint’s inference logic to accept and process this parameter.
1. Update your model serving function to accept word_timestamps
Inside the model serving code you deployed (likely a Python function or MLflow model),
update the handler to read from the input payload:
def predict(model_input):
    import base64
    import tempfile
    import whisper

    # Decode the base64-encoded audio and write it to a temporary file,
    # since whisper.load_audio() expects a file path rather than a file object.
    audio_data = base64.b64decode(model_input["audio"])
    with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as f:
        f.write(audio_data)
        audio_path = f.name

    audio = whisper.load_audio(audio_path)
    word_ts = model_input.get("word_timestamps", False)
    model = whisper.load_model("small")
    result = model.transcribe(audio=audio, word_timestamps=word_ts)
    return result
Make sure the model is set up as a custom Python function model or MLflow pyfunc with
this logic in the predict() or model.predict() function.
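As a concrete illustration, the handler above could live in an MLflow `PythonModel` subclass along these lines. This is a minimal sketch, not your deployed code: the class and helper names are hypothetical, and the `try/except` only keeps the sketch importable without MLflow installed.

```python
import base64
import tempfile

try:
    from mlflow.pyfunc import PythonModel
except ImportError:  # keeps the sketch importable without MLflow installed
    PythonModel = object

def decode_audio_to_file(model_input):
    """Decode the base64 'audio' field to a temporary file, since
    whisper.load_audio() expects a file path rather than raw bytes."""
    audio_bytes = base64.b64decode(model_input["audio"])
    with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as f:
        f.write(audio_bytes)
        return f.name

class WhisperWrapper(PythonModel):
    def load_context(self, context):
        import whisper
        # Load the model once at serving startup instead of per request.
        self.model = whisper.load_model("small")

    def predict(self, context, model_input):
        audio_path = decode_audio_to_file(model_input)
        word_ts = model_input.get("word_timestamps", False)
        # transcribe() accepts a file path and loads the audio itself.
        return self.model.transcribe(audio=audio_path, word_timestamps=word_ts)
```

Logging this wrapper with `mlflow.pyfunc.log_model` and registering it is then what the serving endpoint would run.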
2. Pass the parameter in your Databricks query:
response = workspace_client.serving_endpoints.query(
    name="whisperv3",
    inputs={
        "audio": base64_audio_chunks[first_audio_key],
        "word_timestamps": True
    }
)
print(response.predictions[0])
Notes:
If you are using MLflow to deploy, your model must include this in the predict() method in your PythonModel wrapper.
If you are serving through a custom handler behind the Model Serving REST API,
that handler must read word_timestamps from the incoming JSON and pass it to Whisper's transcribe() call.
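To make that last note concrete, here is a sketch of the server-side parsing, assuming the handler receives the raw JSON body. The helper name and the "inputs" wrapper are assumptions, not part of any Databricks API:

```python
import json

def parse_serving_request(raw_body):
    """Pull the base64 audio string and the optional word_timestamps
    flag out of an incoming request body (hypothetical helper)."""
    payload = json.loads(raw_body)
    # Tolerate both a bare body and one wrapped in an "inputs" key.
    inputs = payload.get("inputs", payload)
    return inputs["audio"], bool(inputs.get("word_timestamps", False))
```

The handler would then hand both values to the predict() logic shown in step 1.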
a month ago - last edited a month ago
Hello @lingareddy_Alva.
Thank you for your response. I have deployed the Whisper model using the Serving UI; do I need to redeploy it with MLflow or serve the model through a REST API?
a month ago
Hi @JoaoPigozzo
Since you've already deployed the Whisper model using the Serving UI in Databricks,
whether you also need to use MLflow or a REST API depends on your use case.
What the Serving UI Gives You:
When you deploy a model (like Whisper) using Databricks Model Serving UI, Databricks:
1. Automatically creates a REST API endpoint
2. Handles scaling, versioning, and deployment
3. Allows token-based secure access
4. Provides a sample curl command, Python code, and OpenAPI spec
So you do not need to redeploy it with MLflow unless you need MLflow tracking or experiments.
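Because the UI already exposes a REST endpoint, you could also call it directly over HTTP instead of going through `workspace_client`. A minimal stdlib sketch, assuming the endpoint name from earlier in the thread, a placeholder workspace host and token, and the payload shape from the custom handler discussed above:

```python
import base64
import json
import urllib.request

def build_invocation_request(host, token, audio_bytes, word_timestamps=True):
    """Assemble the HTTP request for the serving endpoint's /invocations
    URL. The host and token are placeholders you must supply yourself."""
    body = json.dumps({
        "inputs": {
            "audio": base64.b64encode(audio_bytes).decode("ascii"),
            "word_timestamps": word_timestamps,
        }
    }).encode("utf-8")
    return urllib.request.Request(
        f"https://{host}/serving-endpoints/whisperv3/invocations",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

Sending it is then a matter of `urllib.request.urlopen(req)` and reading the JSON response.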
3 weeks ago
Hi @lingareddy_Alva. I appreciate your response.
I’ve looked for documentation but haven’t been able to find a solution. As you mentioned, I need to modify the serving endpoint’s inference logic to accept and handle this parameter. However, I don’t see any option to do that in the Serving UI, and I’m also unsure how to apply this update programmatically.