Hi, I'm super new to Databricks. I'm trying to do a little API scripting against my company's Databricks instance.
I have this super-simple Python 3 script, meant to run on a remote host, which tries to run a simple SQL query against the instance:
```python
import requests
import json

HOST = 'https://**********************************'
TOKEN = '**************************'

endpoint = f'{HOST}/api/2.0/sql/queries'
headers = {
    'Authorization': f'Bearer {TOKEN}',
    'Content-Type': 'application/json'
}
data = {
    "query": "SELECT * FROM table1 LIMIT 10",
    "data_source": {
        "catalog_name": "hive_metastore",
        "schema_name": "default"
    },
    "output_format": "json"
}

response = requests.post(endpoint, headers=headers, data=json.dumps(data))
if response.status_code == 200:
    query_result = response.json()
    print("Query executed successfully. Results:")
    print(query_result)
else:
    print(f"Failed to execute query. Status code: {response.status_code}, Message: {response.text}")
```
What could be simpler? "Give me the first ten rows of 'table1'."
(I have verified that the URL and token are valid.)
The problem is that Databricks doesn't like the formatting of the `data` JSON dictionary. Here's the error I see when the script runs:
```
Failed to execute query. Status code: 400, Message: {"error_code":"MALFORMED_REQUEST","message":"Could not parse request object: Expected 'START_OBJECT' not 'VALUE_STRING'\n at [Source: (ByteArrayInputStream); line: 1, column: 11]\n at [Source: java.io.ByteArrayInputStream@1cb8bdf; line: 1, column: 11]"}
```
I don't really understand what this error message means, but the AI chatbot I'm using assures me that Databricks is gagging on the format of my JSON `data` payload. "Consult the Databricks documentation" is its best advice.
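The one piece I think I've decoded is what column 11 points at. Counting characters in the serialized payload, column 11 is the opening quote of the string value of `"query"`, which would explain `Expected 'START_OBJECT' not 'VALUE_STRING'`: the server apparently wants an object there, not a string. A quick local check (this just inspects the payload prefix; nothing Databricks-specific):

```python
import json

data = {
    "query": "SELECT * FROM table1 LIMIT 10",
    "data_source": {"catalog_name": "hive_metastore", "schema_name": "default"},
    "output_format": "json",
}
body = json.dumps(data)
print(body[:11])  # -> {"query": "
print(body[10])   # -> "   (character 11, counting from 1: the start of the string value)
```

I just don't know what shape of object it expects there instead.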
So I guess I was hoping for some general advice. Is my approach naive? If so, could someone recommend an example program I could consult?
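For what it's worth, after skimming the REST API reference I get the impression that `/api/2.0/sql/queries` creates a *saved* query object rather than executing anything, and that the Statement Execution API (`POST /api/2.0/sql/statements`) is the one meant for one-off statements like mine. Here's a minimal sketch of what I think that call would look like; I haven't verified it end to end, and the warehouse ID is a placeholder I'd still have to look up:

```python
import requests

HOST = 'https://**********************************'
TOKEN = '**************************'

# Statement Execution API: runs one SQL statement and can wait for the
# result inline. 'warehouse_id' is a placeholder -- it must be the ID of
# a running SQL warehouse in the workspace.
endpoint = f'{HOST}/api/2.0/sql/statements'
payload = {
    "statement": "SELECT * FROM table1 LIMIT 10",
    "warehouse_id": "<my-warehouse-id>",  # placeholder
    "catalog": "hive_metastore",
    "schema": "default",
    "wait_timeout": "30s",  # block up to 30s; longer queries need polling
}

response = requests.post(
    endpoint,
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=payload,  # requests serializes the dict and sets Content-Type
)
response.raise_for_status()
result = response.json()

if result.get("status", {}).get("state") == "SUCCEEDED":
    # Rows come back as a list of lists under result -> data_array
    for row in result["result"]["data_array"]:
        print(row)
else:
    print("Statement did not succeed:", result.get("status"))
```

Does that look like the right track, or should I be using something like the official Python SDK instead? Thank you.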