
How to run a saved query from a Notebook (PySpark)

pacman
New Contributor

Hi Team! Noob to Databricks, so apologies if I ask a dumb question.

I have created a relatively large series of queries that fetch and organize the data I want.  I'm ready to drive all of these from a Notebook (likely PySpark).

An example query is saved as test1 under my user account (it shows up under the Queries heading in the left navbar).  I've also created a Notebook under that same account called Investigations (and both the notebook and query show up under Workspace).

My sample code is simple:

query_name = "test1"
results = spark.sql(f"SELECT * FROM {query_name}")

# Show the results
display(results)

 
But it produces the TABLE_OR_VIEW_NOT_FOUND error:

AnalysisException: [TABLE_OR_VIEW_NOT_FOUND] The table or view `test1` cannot be found. Verify the spelling and correctness of the schema and catalog. If you did not qualify the name with a schema, verify the current_schema() output, or qualify the name with the correct schema and catalog.

 
I'm sure the issue is just that I'm not properly referencing the query.  Help!!

2 REPLIES

Wojciech_BUK
Contributor III

Databricks just pastes the string value "test1" into your code, so it is equivalent to running the SQL command:

SELECT * FROM test1

Spark is therefore expecting to read data from an existing table named test1, but it can't find one.

Any table can be referenced by providing the catalog name, schema name, and table name, like: my_catalog.my_schema.test1
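For example (a quick sketch, assuming test1 actually lives in a hypothetical my_catalog.my_schema):

# Fully qualified reference: catalog.schema.table (placeholder names)
results = spark.sql("SELECT * FROM my_catalog.my_schema.test1")
display(results)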

Or 

You can point your current session at a certain catalog and schema, like: USE my_catalog.my_schema

Then it will always search for the table in that schema, and you can refer to the table the way you did.
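For example (again using the hypothetical my_catalog.my_schema):

# Point the session at a default catalog and schema
spark.sql("USE my_catalog.my_schema")

# Unqualified names now resolve against my_catalog.my_schema
results = spark.sql("SELECT * FROM test1")
display(results)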

I am not sure if this is what you are looking for, but if it doesn't help, please attach a screenshot of your UI.

Walter_C
Valued Contributor II

You're getting this error because you're passing the name of a saved query to spark.sql(), and a saved query is not a table or view in your database. The spark.sql() function runs SQL directly against tables and views registered in your Spark session.

If you want to execute a saved query, you need to fetch the SQL text of the saved query first. However, Databricks does not provide a built-in function to fetch the SQL text of a saved query directly in a notebook.

As a workaround, you can manually copy the SQL text of your saved query and use it in your notebook. Here's an example:

 

# Replace this with the SQL text of your saved query
sql_text = """
SELECT *
FROM your_table
WHERE your_condition
"""

results = spark.sql(sql_text)

# Show the results
display(results)
 

 

In this example, replace sql_text with the actual SQL text of your saved query. This way, you're running the SQL query directly on your data, not trying to run it on the saved query name.

If you have a large number of saved queries and manually copying the SQL text is not feasible, you might need to consider using Databricks' REST API or CLI to automate the process of fetching the SQL text of your saved queries.
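For example, something along these lines can pull the SQL text by query name (a rough sketch against the legacy Queries API; the /api/2.0/preview/sql/queries endpoint, the page_size parameter, and the name/query field names are assumptions that depend on your API version, so check the REST API reference for your workspace):

import os
import requests

# Assumes DATABRICKS_HOST (e.g. https://<workspace>.cloud.databricks.com) and a
# personal access token in DATABRICKS_TOKEN are available to the notebook.
host = os.environ["DATABRICKS_HOST"]
token = os.environ["DATABRICKS_TOKEN"]

resp = requests.get(
    f"{host}/api/2.0/preview/sql/queries",
    headers={"Authorization": f"Bearer {token}"},
    params={"page_size": 100},
)
resp.raise_for_status()

# Find the saved query named "test1" and run its SQL text
for q in resp.json().get("results", []):
    if q.get("name") == "test1":
        results = spark.sql(q["query"])
        display(results)
        break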
