How to Increase HTTP Request Timeout for Databricks App Beyond 120 Seconds?

snarayan
New Contributor II

I’ve built a Databricks App using Gradio that leverages predict_stream to get streaming responses from a multi-agent supervisor. The app coordinates reasoning across four knowledge agents, so the model uses a long chain-of-thought process before returning a final answer.
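For context, the streaming path is wired up roughly like this. This is a simplified sketch: the endpoint name and the chunk/payload shapes are placeholders, and it assumes the MLflow deployments client is what backs predict_stream.

```python
# Simplified sketch of the app's streaming path (endpoint name is a placeholder).
import gradio as gr
from mlflow.deployments import get_deploy_client

client = get_deploy_client("databricks")

def respond(message, history):
    partial = ""
    # predict_stream yields chunks as the supervisor produces them
    for chunk in client.predict_stream(
        endpoint="multi-agent-supervisor",
        inputs={"messages": [{"role": "user", "content": message}]},
    ):
        # Chunk shape depends on the endpoint's task type; this assumes a
        # chat-completions-style delta payload.
        delta = chunk.get("choices", [{}])[0].get("delta", {}).get("content", "")
        partial += delta or ""
        yield partial

gr.ChatInterface(respond).launch()
```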

Issue
Whenever the streaming response exceeds 120 seconds, the entire stream freezes. At that point the logs also stop updating, which suggests the HTTP request is timing out. This is a problem because the reasoning process for complex queries often takes longer than two minutes. The rest of the app seems to work fine.

I’ve checked the app configuration but haven’t found any setting for request_timeout or similar. I’m not sure if this is something that needs to be configured in Model Serving, App settings, or elsewhere in Databricks.

What I’ve Tried

  • Verified the Gradio setup and streaming logic.
  • Looked through Databricks documentation for request timeout settings but couldn’t find anything specific for Apps or Model Serving.
  • Confirmed that the issue consistently occurs at exactly 120 seconds, which feels like a hard limit (see the timing sketch after this list).
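
For reference, this is roughly how I timed the cutoff and how one could check whether it comes from the serving endpoint itself or from the App's HTTP layer: call the endpoint directly with a generous client-side timeout and timestamp each chunk. The host, token, endpoint name, and payload shape below are placeholders.

```python
# Rough sketch for timing the stream outside the App; host, token,
# endpoint name, and payload shape are placeholders.
import os
import time
import requests

url = f"{os.environ['DATABRICKS_HOST']}/serving-endpoints/multi-agent-supervisor/invocations"
headers = {"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"}
payload = {
    "messages": [{"role": "user", "content": "a complex multi-agent query"}],
    "stream": True,
}

start = time.time()
# timeout=(connect, read): 10 s to connect, up to 10 min between chunks
with requests.post(url, headers=headers, json=payload, stream=True, timeout=(10, 600)) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines():
        if line:
            print(f"[{time.time() - start:6.1f}s] {line[:80]!r}")
```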

Question

  • Is there a way to increase the HTTP request timeout for Databricks Apps beyond 120 seconds?
  • Alternatively, is there a configuration to allow longer streaming responses from predict_stream in Model Serving?
  • Where should I set this—App settings, Model Serving endpoint, or somewhere else?

Any guidance or workaround would be greatly appreciated!
