Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

How to execute .sql file in volume

lauraxyz
Contributor

I have giant queries (SELECT ... FROM) that I store in .sql files. I want to put those files in a Volume and run the queries from a workflow task.

I can load the file content into a string and then run the query. My question is: is there another option where I don't need to load the file content, but can directly execute the .sql file and store the results in a DataFrame?

8 REPLIES

JAHNAVI
Databricks Employee

Hi @lauraxyz, we can load the SQL file from Volumes using dbutils.fs and then create a DataFrame using spark.sql().

Example: 

# Read the query text from the Volume (head() returns at most 64 KiB by default)
sql_query = dbutils.fs.head("/Volumes/jahnavi/datasets/data/test.sql")
# Run the query and materialize the result as a DataFrame
result_df = spark.sql(sql_query)
display(result_df)
Jahnavi N
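One caveat with the snippet above: dbutils.fs.head returns at most ~64 KiB by default, which could truncate the "giant" queries in question. Since Volumes are also exposed through the local file API on Databricks, a sketch that reads the whole file instead (the temp-file path here is a stand-in for a Volume path such as /Volumes/jahnavi/datasets/data/test.sql):

```python
from pathlib import Path
import tempfile

# Stand-in for a .sql file in a Volume; on Databricks, ordinary Python
# file I/O works directly against /Volumes/... paths.
sql_path = Path(tempfile.gettempdir()) / "test.sql"
sql_path.write_text("SELECT 1 AS id")

sql_query = sql_path.read_text()    # full contents, no head() size cap
# result_df = spark.sql(sql_query)  # on a cluster: run and get a DataFrame
print(sql_query)  # → SELECT 1 AS id
```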

Thanks Jahnavi! That's what I'm doing now; I was wondering if there's a way to execute the file directly without parsing its content. Another example: if I have a Python notebook in a Volume, could I execute that notebook directly without parsing its content?

JAHNAVI
Databricks Employee

@lauraxyz For SQL there is no direct way to run the file without parsing it. However, for Python, we can use %run to run the file from volumes.

Example:

%python
%run /Volumes/jahnavi/datasets/data/test.py
Jahnavi N
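If %run does not fit (for example, when the path is built at runtime), one hedged alternative is to execute the file's source directly with exec(). A sketch, with a temp file standing in for the Volume path:

```python
import os
import tempfile

# Stand-in for a .py file in a Volume, e.g. /Volumes/jahnavi/datasets/data/test.py;
# on Databricks the Volume path can be opened directly.
script_path = os.path.join(tempfile.gettempdir(), "test_script.py")
with open(script_path, "w") as f:
    f.write("greeting = 'hello from volume'")

# Execute the file into a namespace, similar in spirit to an inline %run.
namespace = {}
with open(script_path) as f:
    exec(compile(f.read(), script_path, "exec"), namespace)

print(namespace["greeting"])  # → hello from volume
```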

Thank you @JAHNAVI!
How about Python notebooks? Can we directly run .ipynb files?

JAHNAVI
Databricks Employee

Hi @lauraxyz , Good Day! 

We can run the command below to run .ipynb files:

%python
%run /Volumes/jahnavi/datasets/data/test12.ipynb
Jahnavi N

lauraxyz
Contributor

Thanks, gonna give it a try!

lauraxyz
Contributor

.ipynb could not be run due to ModuleNotFoundError: No module named 'nbformat'.

The .py command seemed to pass, but the insertion was never executed, so it failed silently.

lauraxyz
Contributor

Issue resolved:

For .py, I was using spark, and I had to explicitly create the Spark session so that the script could run properly and insert data.
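A minimal sketch of the root cause and fix: a file executed outside the notebook context does not inherit notebook globals such as spark, so the script must define its own dependencies (e.g. SparkSession.builder.getOrCreate() at the top of the .py file). The same failure mode, shown here with a stand-in name in place of spark:

```python
# A script that relies on a name the notebook would normally provide
# (analogous to a .py file that uses `spark` without creating a session).
script = "rows_written = len(source_rows)"

# Without the dependency defined, execution fails with NameError.
try:
    exec(script, {})
except NameError as e:
    failure = str(e)

# Defining the dependency first (analogous to SparkSession.builder.getOrCreate())
# lets the script run and do its work.
env = {"source_rows": [1, 2, 3]}
exec(script, env)
print(env["rows_written"])  # → 3
```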