cancel
Showing results for 
Search instead for 
Did you mean: 
Generative AI
Explore discussions on generative artificial intelligence techniques and applications within the Databricks Community. Share ideas, challenges, and breakthroughs in this cutting-edge field.
cancel
Showing results for 
Search instead for 
Did you mean: 

Databricks app calling Genie API has empty data array

lewissolarin
New Contributor

I have a Databricks app that makes calls to Genie API, it successfully returns a response containing the text response and the generated SQL query. However, the actual result of the SQL query is missing, in the result object, there is the metadata for the result but the data array is None.

result=ResultData(byte_count=None, chunk_index=0, data_array=None, external_links=[], next_chunk_index=None, next_chunk_internal_link=None, row_count=1, row_offset=0)

When I inspect in monitoring, there is also no SQL result whereas if I use the exact same prompt in the Genie web interface then there is a result (As shown in the attached screenshots).

Any idea of the issue here? 

3 REPLIES 3

pradeep_singh
Contributor

Can you inspect data_typed_array if data_array is None.

Thank You
Pradeep Singh - https://www.linkedin.com/in/dbxdev

anshu_roy
Databricks Employee
Databricks Employee

Hello,

 

You can use the Genie statement_id and the Statement Execution API to retrieve the SQL result set, then read result.result.data_array to build your DataFrame.

A working example is provided in the Databricks Apps Cookbook: https://apps-cookbook.dev/docs/dash/bi/genie_api/

Hope this helps!​

SteveOstrowski
Databricks Employee
Databricks Employee
Hi @lewissolarin,

The behavior you are seeing, where data_array is None and external_links is empty, is expected at a certain stage of the Genie API workflow. The Genie API uses the EXTERNAL_LINKS disposition for returning query results, which means data_array will always be None. The actual result data comes back through external_links (presigned URLs), but you need to call the right endpoint at the right time to get them.

Here is the typical end-to-end workflow when calling the Genie API from a Databricks App:


STEP 1: START A CONVERSATION OR SEND A MESSAGE

Use start_conversation (or create_message for follow-ups) and wait for the message to reach COMPLETED status:

from databricks.sdk import WorkspaceClient
import time

w = WorkspaceClient()

space_id = ""

# Start a new conversation
message = w.genie.start_conversation_and_wait(
space_id=space_id,
content="What were total sales last month?"
)


STEP 2: EXTRACT THE ATTACHMENT ID

Once the message is completed, look at the attachments to find the query attachment. The attachment contains the SQL that Genie generated. You need its ID to fetch the query result:

# Get the message details
msg = w.genie.get_message(
space_id=space_id,
conversation_id=message.conversation_id,
message_id=message.id
)

# Find the query attachment
attachment_id = None
if msg.attachments:
for att in msg.attachments:
if att.query:
attachment_id = att.id
break


STEP 3: FETCH THE QUERY RESULT

This is the step that is likely missing or incomplete in your code. You need to call get_message_attachment_query_result with the attachment_id:

if attachment_id:
result = w.genie.get_message_attachment_query_result(
space_id=space_id,
conversation_id=message.conversation_id,
message_id=message.id,
attachment_id=attachment_id
)

The result object is a GenieGetMessageQueryResultResponse. The actual data is in the statement_response field, which follows the same structure as the SQL Statement Execution API.


STEP 4: READ THE DATA FROM EXTERNAL LINKS

Because the Genie API uses EXTERNAL_LINKS disposition, you need to download the data from the presigned URLs provided in the result:

import requests
import json

# Access the result chunks
statement_response = result.statement_response
if statement_response and statement_response.result:
ext_links = statement_response.result.external_links
if ext_links:
for link in ext_links:
# Download the data (do NOT include Databricks auth headers)
resp = requests.get(link.external_link)
# Process the downloaded data based on format
data = resp.content


COMMON CAUSES OF EMPTY DATA

If you are still seeing empty external_links after following the workflow above, check the following:

1. TIMING: Make sure the message status is COMPLETED before fetching the query result. If you read the result while the query is still executing, the data will not be available yet. The start_conversation_and_wait method handles this polling for you, but if you are using the raw REST API, you need to poll get_message until the status shows COMPLETED.

2. ATTACHMENT ID: Confirm you are passing the correct attachment_id. Each message can have multiple attachments, and you specifically need the one with a query object.

3. PERMISSIONS: The app's service principal needs CAN RUN permission on the Genie Space, plus USE CATALOG, USE SCHEMA, and SELECT on the underlying tables. If permissions are insufficient, the query may complete but return no data.

4. DEPRECATED METHODS: If you are using get_message_query_result (without the attachment_id parameter), note that this method is deprecated. Switch to get_message_attachment_query_result which requires the attachment_id.

5. RE-EXECUTE IF EXPIRED: If the query result has expired, you can re-run it using execute_message_attachment_query with the same parameters.


ALTERNATIVE: USE THE DATABRICKS SDK HELPER

If you are using the Databricks Python SDK, you can also use the higher-level helpers that handle the full workflow for you, including polling and result retrieval.


DOCUMENTATION REFERENCES

Genie API reference:
https://docs.databricks.com/api/workspace/genie

Databricks Apps with Genie Spaces:
https://docs.databricks.com/aws/en/dev-tools/databricks-apps/genie.html

SQL Statement Execution API (for understanding EXTERNAL_LINKS disposition):
https://docs.databricks.com/aws/en/sql/admin/sql-execution-tutorial.html

Databricks Python SDK Genie module:
https://databricks-sdk-py.readthedocs.io/en/latest/workspace/dashboards/genie.html

* This reply used an agent system I built to research and draft this response based on the wide set of documentation I have available and previous memory. I personally review the draft for any obvious issues and for monitoring system reliability and update it when I detect any drift, but there is still a small chance that something is inaccurate, especially if you are experimenting with brand new features.

If this answer resolves your question, could you mark it as "Accept as Solution"? That helps other users quickly find the correct fix.