Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Alternative of spark.sql.globalTempDatabase

Saf4Databricks
Contributor

Question: Since I'm using Databricks Free Edition, which provides only serverless compute, I cannot use spark.sql.globalTempDatabase in my code below. What's an alternative solution for the Caller_Notebook below? The following error occurs on the second line of that notebook: globalTempDatabase is not available in serverless compute environment.

I'm working on passing structured data between two notebooks. In the following code, the called notebook is supposed to return data (1 column with 5 rows) to the caller notebook:

Called_Notebook:

spark.range(5).toDF("MyCol").createOrReplaceTempView("MyView")
dbutils.notebook.exit("my_data")

Caller_Notebook:

returned_table = dbutils.notebook.run("./Called_Notebook", 60)
My_temp_db = spark.conf.get("spark.sql.globalTempDatabase")
display(table(My_temp_db + "." + returned_table))
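Since global temp views aren't available on serverless compute, one alternative (a sketch, not something proposed in this thread) is to serialize the data to a JSON string and return it through dbutils.notebook.exit(), which works on any compute. The dbutils calls only run inside Databricks, so they are shown as comments; the runnable part below demonstrates just the JSON round trip the two notebooks would rely on.

```python
import json

# Called_Notebook (Databricks-only API, shown as comments):
#   data = [row["MyCol"] for row in spark.range(5).toDF("MyCol").collect()]
#   dbutils.notebook.exit(json.dumps(data))
#
# Caller_Notebook:
#   payload = dbutils.notebook.run("./Called_Notebook", 60)
#   values = json.loads(payload)

# The serialization round trip itself, runnable anywhere:
data = list(range(5))          # stand-in for the collected MyCol values
payload = json.dumps(data)     # what the called notebook would exit() with
values = json.loads(payload)   # what the caller recovers from the run() result
print(values)                  # [0, 1, 2, 3, 4]
```

Note that dbutils.notebook.exit() can only return a string, so this pattern suits small result sets; for large data, writing a table and returning its name is the usual approach.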

 

8 REPLIES

balajij8
Contributor

You can create tables if large data is involved, or use session temporary views for passing data when their conditions are met.

Details here 

Hi @balajij8, as you can see in Called_Notebook, I'm already using a temporary view. The issue is in the first line of Caller_Notebook, where I'm trying to get the temp database name so I can qualify the view as schema.My_View in the second line:

My_temp_db = spark.conf.get("spark.sql.globalTempDatabase")
display(table(My_temp_db + "." + returned_table))

 

balajij8
Contributor

Hi @Saf4Databricks, as you can see here, global temp views are not supported in serverless. You can remove the global temp database from your code, since it's not supported.

To use session temporary views instead, run the notebook with %run "./Notebook" so the view stays in the current session, and then reference the view myview directly in table().

Caller Notebook

%run "./Notebook1"
display(table("myview"))

Hi @balajij8: Your suggested Caller_Notebook code returns the following error (I'm using the name my_view):

The table or view `my_view` cannot be found. Verify the spelling and correctness of the schema and catalog. If you did not qualify the name with a schema, verify the current_schema() output, or qualify the name with the correct schema and catalog. To tolerate the error on drop use DROP VIEW IF EXISTS or DROP TABLE IF EXISTS. SQLSTATE: 42P01

NageshPatil
New Contributor III

Hi @Saf4Databricks ,

As mentioned in the Databricks documentation and also by @balajij8, you can only use temp tables with a SQL warehouse cluster. If you are working with a non-serverless cluster, i.e. an interactive cluster, then simply run your Called_Notebook in the first cell of your Caller_Notebook.
Once you run the called notebook as the first cell, the temp view it creates is directly available to query in the caller notebook.

PFB sample code in caller_notebook:

cell 1:

%run ./Called_Notebook

cell 2:

df = spark.table("myview")
df.display()

Or you can also try:

%sql SELECT * FROM myview

Please try it and accept as a solution if it works as per your requirement.

Nagesh Patil

Hi @NageshPatil: Welcome to the Databricks Community forum, and thanks for sharing your thoughts. As mentioned in my original post above, Databricks Free Edition uses a serverless cluster only, so I cannot use a SQL Warehouse cluster. Moreover, as mentioned in the post, I'm passing structured data between two notebooks (that is, not between two cells of the same notebook).

balajij8
Contributor

Hi @Saf4Databricks 

You can use the below.

Called_Notebook:

spark.range(5).toDF("MyCol").createOrReplaceTempView("MyView")

Caller_Notebook:

%run "./Notebook"

 

display(table("MyView"))

 MyCol
0
1
2
3
4

Hi @balajij8, I'm already aware of the %run command. I'm practicing the dbutils.notebook.run() API because I'm working on cases where %run does not work.

The dbutils.notebook API complements %run because it lets you pass parameters to a notebook and return values from it. Likewise, you can use dbutils.notebook.run() to invoke an R notebook, which you can't do with the %run command.
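To illustrate the parameter-passing side mentioned above, here is a minimal sketch (the parameter name "n" and the notebook path are illustrative, not from this thread). dbutils.notebook.run() accepts an arguments dict whose values must be strings, and dbutils.notebook.exit() likewise returns a string, so both sides have to cast. The Databricks-only calls are shown as comments; the runnable part below is the string round trip the two notebooks depend on.

```python
# Caller_Notebook (Databricks-only API, shown as comments):
#   result = dbutils.notebook.run("./Called_Notebook", 60, {"n": "5"})
#
# Called_Notebook:
#   n = int(dbutils.widgets.get("n"))   # widget values arrive as strings
#   dbutils.notebook.exit(str(n * n))   # exit() can only return a string

# The string round trip itself, runnable anywhere:
params = {"n": "5"}      # argument values must be strings
n = int(params["n"])     # the called notebook casts back to int
result = str(n * n)      # and returns a string result
print(result)            # 25
```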