4 weeks ago
Hi Databricks Team / Community,
I'm encountering a 500 Internal Server Error when calling an Agent Bricks MAS endpoint in my workspace. The error message is:
500 Internal Error. Please try again later. If this issue persists, please contact Databricks support.
Context:
Troubleshooting I've Tried:
I would appreciate guidance on:
Thank you in advance for any help or insights!
3 weeks ago
Hi @tsukitsune, thanks for the detailed context; here's a concise set of causes, diagnostics, and workarounds to get your multi-agent supervisor (MAS) stable.
Missing or misconfigured Agent Framework On-Behalf-Of (OBO) authorization. MAS invokes sub-agents with the caller's permissions; OBO must be enabled and the MAS re-created after toggling it.
A sub-agent uses a disabled pay-as-you-go (PayGo) model (e.g., Claude) or a model that's not allowed in the workspace; MAS logs show PERMISSION_DENIED / "Model disabled", which bubbles up as a 500.
Intermittent infra issues or a prior MAS bug around parallel tool calls; a fix was shipped, and updating the endpoint resolved repeated 500s in multiple workspaces.
Rate limiting can surface as a 500 in some paths; ensure AI Gateway rate limits aren't being hit by MAS traffic.
Serverless compute dependency missing in the workspace (MAS relies on serverless model serving).
Payload/response size or execution limits exceeded during orchestration (e.g., Genie returning large intermediate results). For agents, the request payload limit is 4 MB, responses larger than 1 MB aren't logged, and the maximum execution time per request is 297 seconds.
Using unsupported sub-agent types. MAS currently supports Agent Bricks: Knowledge Assistant endpoints (plus Genie, UC functions, and MCP servers). Custom-code agents not created via Knowledge Assistant are not supported as an "Agent Endpoint" in the MAS UI.
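Before digging into logs, it can also help to reproduce the failure directly against the endpoint and keep the raw response plus headers for correlation with server logs or a support ticket. The call below is a minimal sketch: it assumes your MAS accepts the standard chat-style "messages" payload, and <workspace-host> / <mas-endpoint> are placeholders for your workspace URL and endpoint name.
# Reproduce the 500 and capture response headers for correlation/support
curl -i -X POST \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "simple test question"}]}' \
  "https://<workspace-host>/serving-endpoints/<mas-endpoint>/invocations"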
Pull model server logs for the served MAS entity via REST; these show runtime errors that lead to 500s:
# Served model logs
curl -H "Authorization: Bearer $TOKEN" \
"https://<workspace-host>/api/2.0/serving-endpoints/<mas-endpoint>/served-models/<served-model-name>/logs?config_version=1"
And container build logs:
curl -H "Authorization: Bearer $TOKEN" \
"https://<workspace-host>/api/2.0/serving-endpoints/<mas-endpoint>/served-models/<served-model-name>/build-logs?config_version=1"
Enable AI Gateway inference tables on the MAS endpoint; these log request/response payloads and MLflow traces for agents (a REST sketch for enabling this follows this diagnostics list). Note: logging is best-effort and may not populate for 500s; payloads larger than 1 MiB won't be logged.
Use MLflow 3 real-time tracing for agent observability; MAS and sub-agents log traces to an experiment and, optionally, to Delta tables for production monitoring.
Check endpoint health metrics (latency, error rate, QPS) and service logs in the Serving UI for runtime behavior and failures; a metrics-scrape sketch appears after the action checklist further down.
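As mentioned above, inference tables can be switched on from the endpoint's Gateway settings in the UI; a rough REST equivalent is sketched below. The field names follow the AI Gateway update API as I understand it, and main / agent_logs are placeholder catalog and schema names, so verify against the current REST reference before relying on it.
# Enable AI Gateway inference tables on the MAS endpoint (sketch; verify field names)
curl -X PUT \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"inference_table_config": {"enabled": true, "catalog_name": "main", "schema_name": "agent_logs"}}' \
  "https://<workspace-host>/api/2.0/serving-endpoints/<mas-endpoint>/ai-gateway"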
MAS supports up to 10 agents/tools; ensure each end user has explicit access to every sub-agent (CAN QUERY for KA endpoints, Share for Genie, EXECUTE for UC functions, USE CONNECTION for MCP). A Permissions API sketch follows this set of notes.
The Knowledge Assistant embedding endpoint (databricks-gte-large-en) must have AI Guardrails and rate limits disabled for ingestion; confirm this in the Gateway settings.
MAS was not designed to pass large dataframes between Genie spaces; it routes and consolidates answers. If your Genie agent produces large intermediate data (e.g., a 5000×22 result set), down-sample/summarize in-agent, or narrow the query so MAS handles smaller responses.
If OBO was toggled or workspace settings changed, re-create the MAS so it picks up auth and routing changes; also click Update Agent (or update the endpoint) to pull recent orchestration fixes that eliminated parallel-call 500s.
Verify PayGo models are permitted if a sub-agent relies on first-party Claude/OpenAI endpoints; otherwise replace them with allowed models or enable PayGo in the workspace.
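For the endpoint-level permissions mentioned above (e.g., CAN QUERY on a Knowledge Assistant sub-agent), the Permissions API can be used to check and grant access; UC function EXECUTE and Genie sharing are granted in Unity Catalog and the Genie UI instead. The calls below are a sketch: they take the endpoint's ID (returned by the GET call shown earlier, not the endpoint name), and user@example.com is a placeholder.
# Check who can access a sub-agent serving endpoint
curl -H "Authorization: Bearer $TOKEN" \
  "https://<workspace-host>/api/2.0/permissions/serving-endpoints/<endpoint-id>"
# Grant CAN QUERY to an end user (additive PATCH; sketch only)
curl -X PATCH \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"access_control_list": [{"user_name": "user@example.com", "permission_level": "CAN_QUERY"}]}' \
  "https://<workspace-host>/api/2.0/permissions/serving-endpoints/<endpoint-id>"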
Confirm Agent Framework OBO is enabled and the MAS was re-created after enabling it; retest.
Validate all sub-agents are supported (KA endpoints, Genie rooms, UC functions, MCP servers) and end-user permissions are set (CAN QUERY/Share/EXECUTE/USE CONNECTION).
Update the MAS endpoint (Configure tab → Update Agent) and retest to pick up the fix for parallel tool-calling 500s.
Review AI Gateway rate limits and disable them temporarily to rule out throttling; then re-apply with safe headroom.
Keep MAS and sub-agent request payloads under 4 MB and design Genie steps to summarize large outputs before returning to MAS.
Pull served-model logs and build logs via the REST calls above; also enable inference tables and real-time MLflow tracing for deeper RCA.
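For the health metrics mentioned in the diagnostics section, serving endpoints also expose a metrics export (Prometheus/OpenMetrics format) you can scrape to watch error rate and latency while you retest. This is a sketch and assumes the metrics export API is available in your workspace; otherwise use the charts in the Serving UI.
# Scrape endpoint metrics (latency, error rate, QPS) in Prometheus/OpenMetrics format
curl -H "Authorization: Bearer $TOKEN" \
  "https://<workspace-host>/api/2.0/serving-endpoints/<mas-endpoint>/metrics"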
Hope this helps you get to a sound resolution.
Cheers, Louis.
2 weeks ago
Thanks @Louis_Frolio for the detailed response! The first tip on turning on Agent Framework On-Behalf-Of (OBO) authorization resolved the issue. Cheers mate!
2 weeks ago
Glad you found a resolution! Cheers, Louis.
2 weeks ago
Hello, I'm facing the same issue while testing sample queries in the "Test your Agent" box.
Could anyone please help me with the process of enabling OBO authorization?
2 weeks ago
Update: Enabled OBO authorization, but it still doesn't seem to resolve the issue. I also cross-checked compute and other requirements.
2 weeks ago
@shivamrai162, did you recreate the agent after enabling the preview?
2 weeks ago
Thanks Kaushal, I tried recreating it again and it's working now.
2 weeks ago
Good to know it's working now @shivamrai162