Accessing delta tables using API outside azure (Workiva)

Shivaprasad
Contributor

I need to access delta tables with API outside azure using in a reporting tool workiva with using the connector. Can someone able to provide the details on how I can achieve it

Shivaprasad
Contributor

I need to access delta tables with API outside azure using in a reporting tool workiva without using the connector. Can someone able to provide the details on how I can achieve it

john533
New Contributor III

I need to use a reporting product called Workiva to retrieve delta tables via an API outside of Azure without the requirement for a connection. Could someone please provide me the specifics on how I may accomplish that?

 

RiyazAliM
Honored Contributor

Hi @Shivaprasad ,

Accessing Delta tables in Databricks from external tools or platforms requires using Databricks REST APIs or JDBC/ODBC connectors. Does your platform supports integration of code snippets?

Riz

Looks like Within Workiva Wdata we can use python. I think I can use below code snippet but it does not give table data, Can I know what changes need to be done to retrieve table data   -

response = requests.get(endpoint, verify=False, headers=header)
 
what changes need to be done to retrieve table data 

 

Shivaprasad
Contributor

Hello @RiyazAliM Thanks. I am thinking of using Databricks REST API's. I need to check on the reporting tool side to see whether it accepts code snippets. I have enabled unity catalog and I am planning to use below code on databricks side, getting ErrorClass=INVALID_PARAMETER_VALUE.LOCATION_OVERLAP] error message.

Any suggestion on what needs to be changed. Also what are my options on the Reporting tool end to retrieve table data  

Thanks

 

import requests
import json

TOKEN = 'xxxxxxx'
my_json = {"statement": "select * from delta.`abfss://databricks-storage@sa14127e1dv0101.dfs.core.windows.net/catalogs/catalogs/drm/__unitystorage/schemas/a8459dc4-6921-460b-a546-27e5cf370493`", "warehouse_id": "33dab3d9caf0e58b"}

auth = {'Authorization': 'Bearer %s' % TOKEN}

response = requests.post('https://adb-3007753333762912.12.azuredatabricks.net:443/api/2.0/sql/statements', json = my_json, headers=auth).json()
print(response)

Hello @Shivaprasad 

Looks like the error is associated with the way you're trying to retrieve the data from the Unity Storage location.

As an alternative, let's try to use the 3 level namespace UC has to submit a query and retrieve the data.

e.g: instead of 

 

SELECT * FROM delta.`<unity_storage_location>`

 

try:

 

SELECT * FROM <catalog_name>.<schema_name>.<table_name>

 

The rest api code snippet looks like below:

 

import requests

databricks_instance = "https://<databricks-instance>"

token = "<your_personal_access_token>"

warehouse_id = "<your_warehouse_id>"  
query = "SELECT * FROM <catalog_name>.<schema_name>.<table_name> LIMIT 10"  

url = f"{databricks_instance}/api/2.0/sql/statements"
data = {
    "statement": query,
    "warehouse_id": warehouse_id
}

response = requests.post(url, headers=headers, json=data)

if response.status_code == 200:
    statement_id = response.json()["statement_id"]
    print("Query submitted. Statement ID:", statement_id)
else:
    print("Error:", response.text)

 

Best,

Riz

RiyazAliM
Honored Contributor

Hello @Shivaprasad 

Expanding on the code snippet provided above. Once you run the Rest Api code provided above, you will get the statement id. 

Use this statement_id to get the query results using the same statements api.
code snippet is as below:

import time

# Replace with your statement ID from the previous step
statement_id = statement_id

url = f"{databricks_instance}/api/2.0/sql/statements/{statement_id}"

while True:
    response = requests.get(url, headers=headers)
    if response.status_code == 200:
        status = response.json()["status"]
        if status['state'] == "SUCCEEDED":
            display("Column Info:", response.json()['manifest']['schema']['columns'])
            display("Query Results:", response.json()["result"]['data_array'])
            break
        elif status['state'] == "FAILED":
            display("Query failed:", response.json())
            break
        else:
            print(f"Query in progress...: {status['state']}")
            time.sleep(5)
    else:
        print("Error:", response.text)
        break

The result will look like below:

RiyazAli_0-1735544063414.png

 

Riz

Shivap
New Contributor III

Hello @RiyazAliM  Thanks and appreciate your help

RiyazAliM
Honored Contributor

Hi @Shivaprasad 
If you think the above response answered your question, please mark it as a solution. 

Best,

Riz

Shivap
New Contributor III

Sorry. for some reason I don't see the option to mark it as solution

jack533
New Contributor III

 

I need to retrieve delta tables using an API outside of Azure without requiring a connection using a reporting tool named Workiva. I would appreciate it if someone could tell me exactly how to do that.