Community Support Request: Integrating External API for Real-Time Data in Databricks One Chats

Vidu
New Contributor

 

🚀Community Support Request: External API Integration in Databricks One / Genie Chat


🎯Objective

I am trying to integrate external real-time data (via API) into Databricks One / Genie chat so that chat responses can combine:

  • Internal workspace data (Unity Catalog tables)

  • External API results (You.com research API)


🧠 Goal

Enable a workflow like:

User query → Chat → (Internal data + External API call) → Unified response


🏗️ What I Have Built So Far (Reproducible Setup)


1. External API Connection (HTTP)

Created a connection with the following details:

  • Name: youcom

  • Host: https://api.you.com

  • Base path: /v1

  • Authentication: Bearer token

  • MCP: Tested both enabled and disabled


2. Unity Catalog Function (API Wrapper)

Created a Unity Catalog function that calls the API using http_request.

Function details:

  • Name: workspace.default.search_web_research

  • Type: TABLE function

  • Language: SQL

Permissions granted:

  • USAGE on catalog and schema

  • EXECUTE on function

  • USE CONNECTION on youcom

  • Access to Genie space

  • Access to SQL warehouse

Validation:

The function works correctly when executed manually using a SQL query and returns expected API results.


3. Genie Space Setup

Created a Genie Space:

  • Name: NYC Taxi Trip Analysis

  • Data source: samples.nyctaxi.trips

  • Warehouse: Serverless

Added:

  • Unity Catalog function under “SQL queries & functions”

  • Example queries referencing the function

Observation:

  • Function is visible in Genie

  • However, it is never invoked during chat queries


4. Custom Agent (Tool Calling)

Built a custom agent using:

  • mlflow.pyfunc.ResponsesAgent

  • Tool-calling (OpenAI-style function schema)

  • Model: llama-3-3-70b

The tool function calls the You.com API using the same connection.

System prompt explicitly enforces:

“Always use the search_web_research function for every user query”


5. Model Serving Endpoint

Deployed the agent as a serving endpoint:

  • Status: Ready

  • Tool calling: Working

  • Direct endpoint invocation: Working

Tags used:

  • discoverable: databricks_one

  • tool_integration: youcom_api


⚠️Challenges / Observations

Despite all the above configurations:

  • No external API calls are triggered in Databricks One / Genie chat

  • Unity Catalog function is not executed

  • Agent endpoint is not invoked

  • Chat only uses internal tables or LLM knowledge


🎯Expected Behavior

When asking a question like:

“What are the latest AI trends in 2026?”

Expected behavior:

  • Chat invokes the Unity Catalog function or agent

  • External API is called

  • Response includes real-time external data


Actual Behavior

  • External API is never called

  • Function and agent are ignored

  • Only internal data or model knowledge is used


Questions

  1. Are Unity Catalog functions in Genie spaces only referenceable, or can they actually be executed by chat?

  2. Is there any way to force function invocation from Genie or Databricks One chat?

  3. Is MCP required for external API usage in chat workflows?

  4. Is MCP currently supported in Databricks One or Genie spaces?

  5. Can a Model Serving endpoint (agent) be registered as a callable tool in Databricks One chat?

  6. Do tags like “discoverable: databricks_one” have any functional impact, or are they only informational?

  7. Does Databricks currently support external API or tool calling in chat workflows?


🧠 Hypothesis

Based on all testing so far, it appears that:

Databricks One / Genie chat may not yet support dynamic tool or external API invocation, and is currently limited to internal data sources.

Would appreciate confirmation or correction.


🙏Request

If anyone has:

  • Successfully integrated external APIs into Databricks chat

  • Enabled tool or function calling

  • Or found a working workaround

Please share guidance, configuration steps, or examples.


💬Happy to Collaborate

I’m happy to share full code, test suggestions, and iterate further with the community.


🙌Thanks in advance!