Generative AI
Explore discussions on generative artificial intelligence techniques and applications within the Databricks Community. Share ideas, challenges, and breakthroughs in this cutting-edge field.

MCP on Databricks

sagarpgowda7777
Visitor

Here’s What Nobody Tells You.
A hands-on look at Genie MCP and DBSQL MCP — what works, what doesn’t, and when to skip MCP entirely.
Let me start with something most MCP content skips. MCP servers don’t just expose tools. They expose three things — tools, prompts, and resources. If you only think about tools, you’ll miss why some things break in ways that aren’t obvious.
I validated this on two Databricks MCP servers: Genie and DBSQL. Here’s what I actually found.
Genie MCP — The Idea Is Right. The Execution Needs Time.
I went in wanting one thing: route queries across multiple Genie spaces without writing routing code for each one. That part does work. But everything around it is still rough.
Query Space
Genie MCP exposes descriptions of your Genie spaces as resources so the LLM can decide which space to query. In theory, smart. In practice, the descriptions are too generic. If you have three Genie spaces and their descriptions are all high-level overviews, the model doesn’t have enough signal to tell them apart. You get misrouting or arbitrary selection. This should improve as the feature matures, but right now you need to write much more specific descriptions by hand (which tables, which metrics, which business questions each space covers) to make routing reliable.
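To make the description problem concrete, here is a toy sketch. It uses plain word overlap as a stand-in for the model's routing decision, which is an assumption for illustration only: a real LLM does not route by keyword counting, and the space names and descriptions below are made up. The point it shows is just that near-identical descriptions give the router nothing to separate the spaces with.

```python
import re


def _words(text: str) -> set[str]:
    """Lowercase a string and return its distinct alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def route(question: str, spaces: dict[str, str]) -> list[str]:
    """Return the space name(s) whose description shares the most words
    with the question. More than one name means the router cannot decide."""
    scores = {
        name: len(_words(question) & _words(description))
        for name, description in spaces.items()
    }
    best = max(scores.values())
    return sorted(name for name, score in scores.items() if score == best)


# Hypothetical descriptions: three high-level overviews that all read alike.
generic = {
    "sales": "Business data and metrics overview",
    "support": "Business data and metrics overview for teams",
    "finance": "Overview of business data",
}

# Hypothetical descriptions that name the actual entities in each space.
specific = {
    "sales": "Orders, revenue, and pipeline by region and quarter",
    "support": "Support tickets, resolution time, and CSAT by product",
    "finance": "Invoices, budget, and expenses by cost center",
}

question = "What was revenue by region last quarter?"
```

With the generic set, `route(question, generic)` returns all three spaces as a tie; with the specific set, it returns only `sales`.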
Poll Responses
Genie is async. You fire a query, get a conversation ID, then poll until the response is ready. The MCP server exposes this flow, but the polling configuration is still on you. You still need retry logic, timeout handling, and handling for failed or cancelled queries. MCP wraps the API call; it doesn’t solve the async complexity underneath.
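A minimal sketch of the polling logic you end up owning either way. `fetch_status` is a hypothetical callable you would write around whatever call returns the current state of the query, and the status strings (`COMPLETED`, `FAILED`, `CANCELLED`) are assumptions, so check them against what your server actually returns.

```python
import time
from typing import Any, Callable


class PollTimeout(TimeoutError):
    """Raised when the result is not ready before the deadline."""


def poll_until_done(
    fetch_status: Callable[[], dict[str, Any]],
    timeout_s: float = 60.0,
    initial_delay_s: float = 1.0,
    max_delay_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Call fetch_status() with exponential backoff until it reports a
    terminal state, and return the final payload on success."""
    deadline = time.monotonic() + timeout_s
    delay = initial_delay_s
    while True:
        result = fetch_status()
        state = result.get("status")
        if state == "COMPLETED":  # assumed success state
            return result
        if state in ("FAILED", "CANCELLED"):  # assumed terminal error states
            raise RuntimeError(f"query ended in state {state}")
        # Give up if the next wait would run past the deadline.
        if time.monotonic() + delay > deadline:
            raise PollTimeout(f"no result after {timeout_s} seconds")
        sleep(delay)
        delay = min(delay * 2, max_delay_s)
```

The `sleep` parameter is injectable so the loop can be tested without real waiting.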
Statelessness — This Is the Real Gap
Every Genie MCP call starts a new session. No state carried across tool invocations. If your agent needs to follow up based on a previous answer, that context is gone. For any conversational agent, this is a hard limitation. The direct Genie Conversation API handles this better because you manage the conversation ID yourself — more code, but you own the state.
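Owning the state yourself can be as small as the sketch below. `start` and `follow_up` are hypothetical callables you would implement against the Genie Conversation API (one that opens a conversation, one that posts a message to an existing one); their names and signatures are mine, not the API's.

```python
from typing import Callable, Optional


class GenieConversation:
    """Keeps one conversation ID alive across turns so follow-up
    questions land in the same conversation."""

    def __init__(
        self,
        start: Callable[[str], tuple[str, str]],
        follow_up: Callable[[str, str], str],
    ) -> None:
        # start(question) -> (conversation_id, answer)    [hypothetical]
        # follow_up(conversation_id, question) -> answer  [hypothetical]
        self._start = start
        self._follow_up = follow_up
        self.conversation_id: Optional[str] = None

    def ask(self, question: str) -> str:
        """Open a conversation on the first turn, reuse it afterwards."""
        if self.conversation_id is None:
            self.conversation_id, answer = self._start(question)
            return answer
        return self._follow_up(self.conversation_id, question)

    def reset(self) -> None:
        """Forget the conversation so the next question starts fresh."""
        self.conversation_id = None
```

An agent would hold one `GenieConversation` per user session and call `ask()` for every turn.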

Area: What’s missing today

Query Space: Descriptions too generic for reliable LLM routing

Poll Responses: Polling still needs to be configured manually; MCP doesn’t abstract that away

Statelessness: Each call starts a new session, so there is no conversation continuity across turns

Best workaround: Custom MCP server wrapping the Genie API, which keeps the routing benefits and gives you full control over state and polling

Where Genie MCP Actually Wins

Multiple Genie spaces. If you’re orchestrating across three or four spaces, the multi-server MCP pattern lets the LLM route from server config. Writing that same thing over raw API means a tool function per space, explicit routing conditionals, and ongoing maintenance. MCP removes that layer.
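A rough sketch of what "route from server config" looks like: one server entry per Genie space, generated from a mapping instead of hand-written per-space tool functions. The URL pattern and the shape of the config dict are assumptions on my part; the exact format depends on your MCP client, and the endpoint path should be checked against the Databricks docs. The host and space IDs are placeholders.

```python
def build_genie_servers(
    workspace_host: str, space_ids: dict[str, str]
) -> dict[str, dict[str, str]]:
    """Build one MCP server entry per Genie space.

    The URL pattern below is an assumed endpoint layout, not a
    confirmed one; verify it before using this for real.
    """
    return {
        f"genie-{name}": {
            "url": f"https://{workspace_host}/api/2.0/mcp/genie/{space_id}"
        }
        for name, space_id in space_ids.items()
    }


# Placeholder host and space IDs, for illustration only.
servers = build_genie_servers(
    "example-workspace.cloud.databricks.com",
    {"sales": "SPACE_ID_SALES", "support": "SPACE_ID_SUPPORT"},
)
```

Adding a fourth space is then one more entry in the mapping rather than another tool function and another routing branch.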

There’s also a third option worth considering: a custom MCP server wrapping the Genie API. You get the routing benefit of MCP without giving up control over state and polling. That’s the direction I’ve been moving toward.

DBSQL MCP — This One’s Ready

Different story here. DBSQL MCP feels production-ready.

Three tools: execute_sql, execute_sql_read_only, and poll_sql_result. The read-only tool is what I’d default to for agentic workflows — full query capability with no risk of an agent running an accidental write.

Tool: Use it when

execute_sql: You need full read/write in a controlled pipeline

execute_sql_read_only: Default for agents, VS Code/Copilot integrations, anything exploratory

poll_sql_result: Query is async or long-running and you need to fetch results after completion
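One way to encode "read-only by default" in an agent is a small guard that picks the tool name before the call goes out. This is a sketch under my own assumptions: the first-keyword check is a naive heuristic, not a SQL parser, and it is not a security boundary. The read-only tool and your Unity Catalog permissions are what actually enforce access. Only the two tool names come from the server.

```python
# Statement keywords treated as reads. This list is an assumption;
# extend or tighten it for your own workloads.
READ_KEYWORDS = ("select", "with", "show", "describe", "explain")


def choose_tool(sql: str, allow_writes: bool = False) -> str:
    """Return the DBSQL MCP tool name to use for a statement.

    Reads always go to the read-only tool. Anything else is refused
    unless the caller explicitly opts in to writes.
    """
    tokens = sql.split(None, 1)
    first = tokens[0].lower() if tokens else ""
    if first in READ_KEYWORDS:
        return "execute_sql_read_only"
    if not allow_writes:
        raise PermissionError(
            f"statement starting with {first!r} is not allowed in read-only mode"
        )
    return "execute_sql"
```

An exploratory agent would call `choose_tool(sql)` and never pass `allow_writes=True`; a controlled pipeline that genuinely needs writes would opt in explicitly.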

Unity Catalog Integration — This Is What Makes It Work

The MCP server respects your UC permissions exactly. You see the tables you have access to. Masked columns stay masked. Encryption policies hold. It doesn’t give the LLM more access than you’d have in the Databricks UI yourself. That governance consistency matters when you’re connecting external tools to your lakehouse.

Authentication is PAT or SSO. The server’s access boundary is your workspace access. Nothing more.

Practically, this means you can query your lakehouse directly from VS Code or GitHub Copilot while writing code, feed governed table data into an agentic reasoning loop, or let external tools run SQL without needing direct cluster access — all while staying within whatever access policies you’ve already set in Unity Catalog.

The Bottom Line

DBSQL MCP is worth using today. Genie MCP is worth watching, but statelessness, routing quality, and the polling gap mean the direct API is still the better call if you need reliability and control.

The exception is multi-space routing. If that’s your main problem, Genie MCP saves real work. And if you need both — routing and control — build a custom MCP layer over the API.

The ecosystem is moving fast. Some of what I’ve described here may already look different by the time you read this. But this is what I found when I actually tested it.

Test it yourself. Don’t build on assumptions.

#MCP #Databricks #AIEngineering #AgenticAI #DataEngineering #LangGraph

 
