TLDR: You cannot increase the upstream gateway timeout in Databricks Apps. The best practice and quick solution for operations that take longer than the gateway limit is to implement a "status poll" (polling) pattern.
Why the Timeout Occurs
Databricks Apps enforce strict ingress gateway timeouts to maintain platform stability. Increasing the keep-alive timeout in your FastAPI configuration only applies to the local container, not to the Databricks ingress proxy that sits in front of it. When your request reaches the platform's hard limit (around 2 minutes), the gateway drops the connection and returns a 504 error, even if your backend task continues to run and eventually completes.
Recommended Solution: "Status Poll" Pattern
To resolve this for production applications, Databricks best practice is to prefer a status poll over long-running synchronous connections.
You can implement this pattern with three small changes:
Trigger and Return: Modify your initial FastAPI endpoint to kick off your 3-minute task in the background and immediately return an HTTP response (such as a 202 Accepted) containing a unique tracking identifier (e.g., task_id).
Poll for Status: Create a secondary endpoint (e.g., /status/{task_id}) that checks the state of the background task and returns whether it is pending, processing, or complete.
Client-Side Updates: Configure your frontend to periodically ping the status endpoint (for example, once every 5 seconds) until the operation finishes and the final payload is retrieved.
This approach completely avoids the gateway timeout, frees up server resources, and aligns with the recommended runtime performance architecture for Databricks Apps.
Please accept the solution if the recommendation worked for you.