cancel
Showing results forย 
Search instead forย 
Did you mean:ย 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results forย 
Search instead forย 
Did you mean:ย 

Accessing delta tables using API outside azure (Workiva)

Shivaprasad
New Contributor III

I need to access delta tables with API outside azure using in a reporting tool workiva with using the connector. Can someone able to provide the details on how I can achieve it

10 REPLIES 10

Shivaprasad
New Contributor III

I need to access delta tables with API outside azure using in a reporting tool workiva without using the connector. Can someone able to provide the details on how I can achieve it

john533
New Contributor II

I need to use a reporting product called Workiva to retrieve delta tables via an API outside of Azure without the requirement for a connection. Could someone please provide me the specifics on how I may accomplish that?

 

RiyazAli
Valued Contributor II

Hi @Shivaprasad ,

Accessing Delta tables in Databricks from external tools or platforms requires using Databricks REST APIs or JDBC/ODBC connectors. Does your platform supports integration of code snippets?

Riz

Shivaprasad
New Contributor III

Looks like Within Workiva Wdata we can use python. I think I can use below code snippet but it does not give table data, Can I know what changes need to be done to retrieve table data   -

response = requests.get(endpoint, verify=False, headers=header)
 
what changes need to be done to retrieve table data 

 

Shivaprasad
New Contributor III

Hello @RiyazAli Thanks. I am thinking of using Databricks REST API's. I need to check on the reporting tool side to see whether it accepts code snippets. I have enabled unity catalog and I am planning to use below code on databricks side, getting ErrorClass=INVALID_PARAMETER_VALUE.LOCATION_OVERLAP] error message.

Any suggestion on what needs to be changed. Also what are my options on the Reporting tool end to retrieve table data  

Thanks

 

import requests
import json

TOKEN = 'xxxxxxx'
my_json = {"statement": "select * from delta.`abfss://databricks-storage@sa14127e1dv0101.dfs.core.windows.net/catalogs/catalogs/drm/__unitystorage/schemas/a8459dc4-6921-460b-a546-27e5cf370493`", "warehouse_id": "33dab3d9caf0e58b"}

auth = {'Authorization': 'Bearer %s' % TOKEN}

response = requests.post('https://adb-3007753333762912.12.azuredatabricks.net:443/api/2.0/sql/statements', json = my_json, headers=auth).json()
print(response)

RiyazAli
Valued Contributor II

Hello @Shivaprasad 

Looks like the error is associated with the way you're trying to retrieve the data from the Unity Storage location.

As an alternative, let's try to use the 3 level namespace UC has to submit a query and retrieve the data.

e.g: instead of 

 

SELECT * FROM delta.`<unity_storage_location>`

 

try:

 

SELECT * FROM <catalog_name>.<schema_name>.<table_name>

 

The rest api code snippet looks like below:

 

import requests

databricks_instance = "https://<databricks-instance>"

token = "<your_personal_access_token>"

warehouse_id = "<your_warehouse_id>"  
query = "SELECT * FROM <catalog_name>.<schema_name>.<table_name> LIMIT 10"  

url = f"{databricks_instance}/api/2.0/sql/statements"
data = {
    "statement": query,
    "warehouse_id": warehouse_id
}

response = requests.post(url, headers=headers, json=data)

if response.status_code == 200:
    statement_id = response.json()["statement_id"]
    print("Query submitted. Statement ID:", statement_id)
else:
    print("Error:", response.text)

 

Best,

Riz

RiyazAli
Valued Contributor II

Hello @Shivaprasad 

Expanding on the code snippet provided above. Once you run the Rest Api code provided above, you will get the statement id. 

Use this statement_id to get the query results using the same statements api.
code snippet is as below:

import time

# Replace with your statement ID from the previous step
statement_id = statement_id

url = f"{databricks_instance}/api/2.0/sql/statements/{statement_id}"

while True:
    response = requests.get(url, headers=headers)
    if response.status_code == 200:
        status = response.json()["status"]
        if status['state'] == "SUCCEEDED":
            display("Column Info:", response.json()['manifest']['schema']['columns'])
            display("Query Results:", response.json()["result"]['data_array'])
            break
        elif status['state'] == "FAILED":
            display("Query failed:", response.json())
            break
        else:
            print(f"Query in progress...: {status['state']}")
            time.sleep(5)
    else:
        print("Error:", response.text)
        break

The result will look like below:

RiyazAli_0-1735544063414.png

 

Riz

Shivap
New Contributor II

Hello @RiyazAli  Thanks and appreciate your help

RiyazAli
Valued Contributor II

Hi @Shivaprasad 
If you think the above response answered your question, please mark it as a solution. 

Best,

Riz

Shivap
New Contributor II

Sorry. for some reason I don't see the option to mark it as solution

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you wonโ€™t want to miss the chance to attend and share knowledge.

If there isnโ€™t a group near you, start one and help create a community that brings people together.

Request a New Group