Run PySpark queries from outside Databricks
09-03-2024 08:49 AM - edited 09-03-2024 08:51 AM
I have written a notebook that executes a PySpark query. I trigger it remotely, from outside the Databricks environment, by calling /api/2.1/jobs/run-now, which runs the notebook as a job. I also want to retrieve the results of this job execution. How should I do that?
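import requests

# Trigger the job; the "query" parameter is passed to the notebook widget of the same name
response = requests.post(
    f"{DATABRICKS_INSTANCE}/api/2.1/jobs/run-now",
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    json={
        "job_id": JOB_ID,
        "notebook_params": {
            "query": SQL_QUERY1
        }
    }
)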
The notebook that runs the PySpark query:
from pyspark.sql import SparkSession

# Read the query passed in through notebook_params
dbutils.widgets.text("query", "")
query = dbutils.widgets.get("query")
# Execute the query
spark = SparkSession.builder.getOrCreate()
df = spark.sql(query)
df.show()
# Return a value from the notebook
#dbutils.notebook.exit('hello!')
#return('Hello')
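One possible approach, sketched here under assumptions rather than as a confirmed answer: have the notebook end with something like dbutils.notebook.exit(json.dumps([row.asDict() for row in df.limit(100).collect()])), then poll /api/2.1/jobs/runs/get with the run_id returned by run-now and read the exit value from /api/2.1/jobs/runs/get-output. The helper name fetch_notebook_json, the http_get hook, the polling defaults, and the choice of the first task's run_id are all hypothetical. It also assumes the notebook exits with a JSON string; the exit value is size-limited, so large results are probably better written to a table or file.

```python
import json
import time


def fetch_notebook_json(host, token, run_id, http_get=None,
                        poll_seconds=5.0, timeout_seconds=600.0):
    """Wait for a job run to finish and return the notebook's exit value, parsed as JSON.

    Assumes the notebook called dbutils.notebook.exit() with a JSON string.
    http_get is a hypothetical hook (url, headers, params) -> dict, mainly for testing.
    """
    if http_get is None:
        import requests  # only needed when making real HTTP calls

        def http_get(url, headers, params):
            resp = requests.get(url, headers=headers, params=params, timeout=30)
            resp.raise_for_status()
            return resp.json()

    headers = {"Authorization": f"Bearer {token}"}
    deadline = time.monotonic() + timeout_seconds

    # Poll the run until it reaches a terminal life cycle state.
    while True:
        run = http_get(f"{host}/api/2.1/jobs/runs/get", headers, {"run_id": run_id})
        state = run.get("state", {})
        if state.get("life_cycle_state") in ("TERMINATED", "SKIPPED", "INTERNAL_ERROR"):
            break
        if time.monotonic() > deadline:
            raise TimeoutError(f"run {run_id} did not finish within {timeout_seconds}s")
        time.sleep(poll_seconds)

    if state.get("result_state") != "SUCCESS":
        raise RuntimeError(f"run {run_id} ended in state {state}")

    # Assumption: for a multi-task job, the output belongs to the task run,
    # so use the first task's run_id; otherwise fall back to the job run_id.
    tasks = run.get("tasks") or []
    output_run_id = tasks[0]["run_id"] if tasks else run_id

    output = http_get(f"{host}/api/2.1/jobs/runs/get-output", headers,
                      {"run_id": output_run_id})
    result = output.get("notebook_output", {}).get("result")
    return json.loads(result) if result else None
```

With the request above, this might be called as fetch_notebook_json(DATABRICKS_INSTANCE, API_TOKEN, response.json()["run_id"]).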