Community Platform Discussions
Connect with fellow community members to discuss general topics related to the Databricks platform, industry trends, and best practices. Share experiences, ask questions, and foster collaboration within the community.

Set default database thru Cluster Spark Configuration

adrianhernandez
New Contributor III

I'm trying to set the default catalog (a.k.a. the default SQL database) in a cluster's Spark configuration. I've tried the following:

spark.catalog.setCurrentDatabase("cbp_reporting_gold_preprod") - this works in a Notebook but doesn't do anything in the Cluster.

spark.sql.catalog.spark_catalog.defaultDatabase("cbp_reporting_gold_preprod")

In the Spark config I enter a slightly different syntax (without the parentheses or quotes). The logs show no errors when these run at startup, but they simply have no effect. I use the following command in a notebook to check:

spark.catalog.currentDatabase()

The end goal is to set several options on the cluster so users can simply query their data with Spark SQL without worrying about which database their tables live in. I've googled extensively for days and have not found a solution. I'm wondering why spark.catalog.setCurrentDatabase("cbp_reporting_gold_preprod") works in a notebook, but the equivalent line in the Spark configuration, spark.catalog.setCurrentDatabase cbp_reporting_gold_preprod, doesn't seem to do anything.
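For reference, the cluster's Spark config field accepts only static `key value` property pairs, one per line; spark.catalog.setCurrentDatabase is a runtime Python API method, not a configuration property, which is likely why that line is silently ignored. Valid entries look like the following (property values here are illustrative, not a recommendation):

```
spark.sql.shuffle.partitions 8
spark.sql.session.timeZone UTC
```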


Kaniz_Fatma
Community Manager

Hi @adrianhernandez, to set the default catalog (the default SQL database) in a cluster's Spark configuration, you can use the Spark configuration property spark.databricks.sql.initial.catalog.name. This property overrides the default catalog for a specific cluster.

Here is how you can set this configuration:

spark.conf.set("spark.databricks.sql.initial.catalog.name", "cbp_reporting_gold_preprod")

Keep in mind that this configuration needs to be set before the SparkSession starts, so for a cluster-wide default it belongs in the cluster's Spark config rather than in a notebook cell that runs after the session is already up.

Also, remember that changing the default catalog can break existing data operations that depend on it.
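In the cluster UI (Advanced options > Spark > Spark config), the same property would be entered as a plain key-value pair; this is a sketch assuming the property name given in the reply above:

```
spark.databricks.sql.initial.catalog.name cbp_reporting_gold_preprod
```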

adrianhernandez
New Contributor III

I've tried different commands in the cluster's Spark config and none of them work. They execute at cluster startup without any errors in the logs, but once you run a notebook attached to the cluster, the default catalog is still set to 'default'.
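As a per-session workaround (not the cluster-wide default the question asks for), the database can be selected explicitly at the top of each notebook with Spark SQL, assuming the schema exists and the user has access to it:

```
-- run once per notebook/session
USE cbp_reporting_gold_preprod;
```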
