Generative AI
Explore discussions on generative artificial intelligence techniques and applications within the Databricks Community. Share ideas, challenges, and breakthroughs in this cutting-edge field.

Agent Bricks - MAS 500 Internal error

tsukitsune
New Contributor

Hi Databricks Team / Community,

I’m encountering a 500 Internal Server Error when calling an Agent Bricks MAS endpoint in my workspace. The error message is:

500 Internal Error. Please try again later. If this issue persists, please contact Databricks support.

Context:

  • I have deployed a multi-agent supervisor using Agent Bricks and exposed it as a serving endpoint.
  • I tried with one to three agents, and every configuration produces the same error. Testing the agent endpoints individually works fine.

Troubleshooting I’ve Tried:

  • Verified workspace permissions; the token/user has access to all referenced models and tools.
  • Checked cluster status; compute resources appear healthy.
  • Re-deployed the endpoint to ensure the latest agent version is active.
  • Tested with smaller payloads.

I would appreciate guidance on:

  1. What could cause a 500 Internal Error in Agent Bricks endpoints?
  2. How to reliably debug or capture detailed logs for such failures.
  3. Any known limitations or workarounds for multi-agent endpoints causing 500 errors.

Thank you in advance for any help or insights!

1 REPLY

Louis_Frolio
Databricks Employee

Hi @tsukitsune ,  thanks for the detailed context—here’s a concise set of causes, diagnostics, and workarounds to get your multi-agent supervisor stable.

Likely root causes of 500 on a Multi‑Agent Supervisor (MAS)

  • Missing or misconfigured Agent Framework On‑Behalf‑Of (OBO) Authorization. MAS invokes sub‑agents with the caller’s permissions; OBO must be enabled and the MAS re‑created after toggling it.

  • A sub‑agent uses a disabled pay‑as‑you‑go (PayGo) model (e.g., Claude) or a model that isn’t allowed in the workspace; the MAS logs show PERMISSION_DENIED/Model disabled, which bubbles up as a 500.

  • Intermittent infra issues or a prior MAS bug around parallel tool calls; a fix was shipped—updating the endpoint resolved repeated 500s in multiple workspaces.

  • Rate limiting can surface as 500 in some paths; ensure AI Gateway rate limits aren’t being hit by MAS traffic.

  • Serverless compute dependency missing in the workspace (MAS relies on serverless model serving).

  • Payload/response size or execution limits exceeded during orchestration (e.g., Genie returning large intermediate results). For agents, the request payload limit is 4 MB, responses larger than 1 MB aren’t logged, and the maximum execution time per request is 297 s.

  • Using unsupported sub‑agent types. MAS currently supports Agent Bricks: Knowledge Assistant endpoints (plus Genie, UC functions, and MCP servers). Custom code agents not created via Knowledge Assistant are not supported as “Agent Endpoint” in the MAS UI.
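Because the payload limit is an easy cause to miss, a quick client-side pre-flight check can rule it out before you ever hit the endpoint. This is just a sketch; the 4 MB figure is the documented request limit, and the payload shown is a placeholder:

```python
import json

MAX_REQUEST_BYTES = 4 * 1024 * 1024  # documented 4 MB request limit for agent endpoints

def check_payload_size(payload: dict) -> int:
    """Serialize the request body and fail fast if it exceeds the
    4 MB limit, instead of getting an opaque server-side error."""
    body = json.dumps(payload).encode("utf-8")
    if len(body) > MAX_REQUEST_BYTES:
        raise ValueError(
            f"Payload is {len(body)} bytes, over the {MAX_REQUEST_BYTES}-byte limit"
        )
    return len(body)

# A typical MAS chat request is tiny, but a large Genie result echoed
# back into the message history can blow past the limit quickly.
size = check_payload_size({"messages": [{"role": "user", "content": "hello"}]})
print(size)
```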

How to capture detailed logs and debug reliably

  • Pull model server logs for the served MAS entity via REST; these show runtime errors that lead to 500s:

    # Served model logs
    curl -H "Authorization: Bearer $TOKEN" \
      "https://<workspace-host>/api/2.0/serving-endpoints/<mas-endpoint>/served-models/<served-model-name>/logs?config_version=1"

    And container build logs:

    curl -H "Authorization: Bearer $TOKEN" \
      "https://<workspace-host>/api/2.0/serving-endpoints/<mas-endpoint>/served-models/<served-model-name>/build-logs?config_version=1"
  • Enable AI Gateway inference tables on the MAS endpoint; these log request/response payloads and MLflow traces for agents. Note: logging is best‑effort and may not populate for 500s; payloads >1 MiB won’t be logged.

  • Use MLflow 3 real‑time tracing for agent observability; MAS and sub‑agents log traces to an experiment and optionally to Delta tables for production monitoring.

  • Check endpoint health metrics (latency, error rate, QPS) and service logs in the Serving UI for runtime behavior and failures.
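If you are pulling logs repeatedly while debugging, the two curl calls above can be wrapped in a small helper. This is only a convenience sketch mirroring those same REST paths; the host, endpoint, and served-model names are placeholders:

```python
import urllib.request

def served_model_logs_url(host: str, endpoint: str, served_model: str,
                          build: bool = False, config_version: int = 1) -> str:
    """Build the serving-endpoints logs (or build-logs) URL used in
    the curl examples above."""
    kind = "build-logs" if build else "logs"
    return (f"https://{host}/api/2.0/serving-endpoints/{endpoint}"
            f"/served-models/{served_model}/{kind}?config_version={config_version}")

def fetch_logs(host: str, endpoint: str, served_model: str, token: str) -> str:
    """Fetch the logs (requires network access to the workspace)."""
    req = urllib.request.Request(
        served_model_logs_url(host, endpoint, served_model),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")

print(served_model_logs_url("example.cloud.databricks.com", "mas-endpoint", "mas-model"))
```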

Known limitations and recommended workarounds

  • MAS supports up to 10 agents/tools; ensure each end user has explicit access to every sub‑agent (CAN QUERY for KA, Share for Genie, EXECUTE for UC functions, USE CONNECTION for MCP).

  • Knowledge Assistant embedding endpoint (databricks‑gte‑large‑en) must have AI Guardrails and rate limits disabled for ingestion; confirm this in Gateway settings.

  • MAS was not designed to pass large dataframes between Genie spaces; it routes and consolidates answers. If your Genie agent produces large intermediate data (e.g., 5000×22 rows), down‑sample/summarize in‑agent, or narrow the query so MAS handles smaller responses.

  • If OBO was toggled or workspace settings changed, re‑create MAS so it picks up auth and routing changes; also click Update Agent (or update the endpoint) to pull recent orchestration fixes that eliminated parallel‑call 500s.

  • Verify PayGo models are permitted if a sub‑agent relies on first‑party Claude/OpenAI endpoints; otherwise replace with allowed models or enable PayGo in the workspace.
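For the intermittent-infrastructure class of 500s, a bounded client-side retry with exponential backoff is a reasonable stopgap while an endpoint update rolls out. A minimal sketch; `call` stands in for however your client invokes the MAS endpoint, and the status-extraction logic should be adapted to your HTTP library:

```python
import time

def call_with_retry(call, max_attempts=4, base_delay=1.0):
    """Retry a callable on transient HTTP 500s with exponential backoff.

    Assumes `call` raises an exception carrying a `status` attribute on
    HTTP errors; any other failure (or exhausted attempts) is re-raised.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception as exc:
            status = getattr(exc, "status", None)
            if status != 500 or attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
```

Keep the attempt count small: if the 500 is caused by a configuration problem (OBO, permissions, disabled models) rather than transient infrastructure, retrying will never succeed.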

Fast checklist to isolate your case

  • Confirm Agent Framework OBO is enabled and the MAS was re‑created after enabling it; retest.

  • Validate all sub‑agents are supported (KA endpoints, Genie rooms, UC functions, MCP servers) and end user permissions are set (CAN QUERY/Share/EXECUTE/USE CONNECTION).

  • Update the MAS endpoint (Configure tab → Update Agent) and retest to pick up the fix for parallel tool‑calling 500s.

  • Review Gateway rate limits and disable limits temporarily to rule out throttling; then re‑apply with safe headroom.

  • Keep MAS and sub‑agent request payloads under 4 MB and design Genie steps to summarize large outputs before returning to MAS.

  • Pull served‑model logs and build logs via the REST calls above; also enable inference tables and real‑time MLflow tracing for deeper RCA.


Hope this helps you get to a sound resolution.

Cheers, Louis.