Hi, I'm super new to Databricks. I'm trying to do a little API scripting against my company's Databricks instance.
I have this super-simple Python 3 script, meant to run on a remote host, which tries to run a simple SQL query against the instance:
```python
import requests
import json

HOST = 'https://**********************************'
TOKEN = '**************************'

endpoint = f'{HOST}/api/2.0/sql/queries'
headers = {
    'Authorization': f'Bearer {TOKEN}',
    'Content-Type': 'application/json'
}
data = {
    "query": "SELECT * FROM table1 LIMIT 10",
    "data_source": {
        "catalog_name": "hive_metastore",
        "schema_name": "default"
    },
    "output_format": "json"
}

response = requests.post(endpoint, headers=headers, data=json.dumps(data))
if response.status_code == 200:
    query_result = response.json()
    print("Query executed successfully. Results:")
    print(query_result)
else:
    print(f"Failed to execute query. Status code: {response.status_code}, Message: {response.text}")
```
What could be simpler? "Give me the first ten rows of 'table1'."
(I have verified that the URL and token are valid.)
The problem is that Databricks doesn't like the formatting of the `data` JSON dictionary. Here's the error I see when the script runs:
```
Failed to execute query. Status code: 400, Message: {"error_code":"MALFORMED_REQUEST","message":"Could not parse request object: Expected 'START_OBJECT' not 'VALUE_STRING'\n at [Source: (ByteArrayInputStream); line: 1, column: 11]\n at [Source: java.io.ByteArrayInputStream@1cb8bdf; line: 1, column: 11]"}
```
I don't really understand what this error message means, but the AI chatbot I'm using assures me that Databricks is gagging on the format of my JSON `data` payload. "Consult the Databricks documentation" is its best advice.
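The one piece I think I've decoded is what column 11 points at. Counting characters in the serialized payload, column 11 is the opening quote of the string value of `"query"`, which would explain `Expected 'START_OBJECT' not 'VALUE_STRING'`: the server apparently wants an object there, not a string. A quick local check (this just inspects the payload prefix; nothing Databricks-specific):

```python
import json

data = {
    "query": "SELECT * FROM table1 LIMIT 10",
    "data_source": {"catalog_name": "hive_metastore", "schema_name": "default"},
    "output_format": "json",
}
body = json.dumps(data)
print(body[:11])  # -> {"query": "
print(body[10])   # -> "   (character 11, counting from 1: the start of the string value)
```

I just don't know what shape of object it expects there instead.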
So I guess I was hoping for some general advice. Is my approach naive? If so, could someone recommend an example program I could consult?
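For what it's worth, after skimming the REST API reference I get the impression that `/api/2.0/sql/queries` creates a *saved* query object rather than executing anything, and that the Statement Execution API (`POST /api/2.0/sql/statements`) is the one meant for one-off statements like mine. Here's a minimal sketch of what I think that call would look like; I haven't verified it end to end, and the warehouse ID is a placeholder I'd still have to look up:

```python
import requests

HOST = 'https://**********************************'
TOKEN = '**************************'

# Statement Execution API: runs one SQL statement and can wait for the
# result inline. 'warehouse_id' is a placeholder -- it must be the ID of
# a running SQL warehouse in the workspace.
endpoint = f'{HOST}/api/2.0/sql/statements'
payload = {
    "statement": "SELECT * FROM table1 LIMIT 10",
    "warehouse_id": "<my-warehouse-id>",  # placeholder
    "catalog": "hive_metastore",
    "schema": "default",
    "wait_timeout": "30s",  # block up to 30s; longer queries need polling
}

response = requests.post(
    endpoint,
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=payload,  # requests serializes the dict and sets Content-Type
)
response.raise_for_status()
result = response.json()

if result.get("status", {}).get("state") == "SUCCEEDED":
    # Rows come back as a list of lists under result -> data_array
    for row in result["result"]["data_array"]:
        print(row)
else:
    print("Statement did not succeed:", result.get("status"))
```

Does that look like the right track, or should I be using something like the official Python SDK instead? Thank you.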