
How to update the external metastore cluster configuration on the fly?

Oliver_Floyd
Contributor

Hello,

In my use case, data is pushed to an ADLS Gen2 container called ingest.

After some processing on a Databricks cluster in the ingest workspace, I declare the associated table in an external metastore for this workspace.

At the end of this processing (depending on certain criteria), I push the curated data (a simple copy) to other containers (lab/qal/prd; each container holds the data for one Databricks workspace), and I want to declare the tables in the metastores of these three workspaces.
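The copy itself is nothing special; it is essentially a recursive file copy between containers, along these lines (the storage account and paths are placeholders):

# hypothetical paths -- copy the curated output from the ingest container to the lab container
src = "abfss://ingest@myaccount.dfs.core.windows.net/curated/my_table"
dst = "abfss://lab@myaccount.dfs.core.windows.net/curated/my_table"
dbutils.fs.cp(src, dst, True)  # True = recursive copy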

One solution is to launch three tasks after this first task, where each task's cluster is configured with the metastore of its target workspace (see the job sketch below). It works, but this solution is cumbersome:

  • a cluster has to be started for each workspace
  • even if the table is already declared in the metastore, the cluster still has to start just to check
  • it slows down our data pipeline
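To give an idea, the job definition looks roughly like this (Jobs API 2.1 format; the notebook path, Spark version, and JDBC URLs are illustrative, and the prd task is analogous):

{
  "name": "declare-curated-tables",
  "tasks": [
    {
      "task_key": "declare_lab",
      "notebook_task": { "notebook_path": "/Jobs/declare_tables" },
      "new_cluster": {
        "spark_version": "10.4.x-scala2.12",
        "num_workers": 1,
        "spark_conf": {
          "spark.hadoop.javax.jdo.option.ConnectionURL": "jdbc:sqlserver://lab_env.database.windows.net:1433;database=labdatabase"
        }
      }
    },
    {
      "task_key": "declare_qal",
      "notebook_task": { "notebook_path": "/Jobs/declare_tables" },
      "new_cluster": {
        "spark_version": "10.4.x-scala2.12",
        "num_workers": 1,
        "spark_conf": {
          "spark.hadoop.javax.jdo.option.ConnectionURL": "jdbc:sqlserver://qal_env.database.windows.net:1433;database=qaldatabase"
        }
      }
    }
  ]
}

Each task needs its own job cluster only because the metastore connection is fixed at cluster start.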

Another solution would be to update the cluster configuration on the fly in the first task. I tried to modify the Spark session configuration with the following lines of code:

spark.sparkContext.getConf().set("spark.hadoop.javax.jdo.option.ConnectionURL", "jdbc:sqlserver://lab_env.database.windows.net:1433;database=labdatabase")

Or

spark.conf.set("spark.hadoop.javax.jdo.option.ConnectionURL", "jdbc:sqlserver://lab_env.database.windows.net:1433;database=labdatabase")

but neither of them seems to work.
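Here is roughly how I checked in the notebook that the change is not picked up (the JDBC values are placeholders):

# getConf() returns a copy of the SparkConf, so mutating it has no effect on the running cluster
conf = spark.sparkContext.getConf()
conf.set("spark.hadoop.javax.jdo.option.ConnectionURL", "jdbc:sqlserver://lab_env.database.windows.net:1433;database=labdatabase")

# the runtime conf accepts and stores the new value...
spark.conf.set("spark.hadoop.javax.jdo.option.ConnectionURL", "jdbc:sqlserver://lab_env.database.windows.net:1433;database=labdatabase")
print(spark.conf.get("spark.hadoop.javax.jdo.option.ConnectionURL"))

# ...but the catalog still answers from the metastore the cluster was started with
spark.sql("SHOW DATABASES").show()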

My question is simple: do you know if there is a way to change this configuration from a notebook, or is it not possible at all?

Thanks in advance for your help.


3 REPLIES

Kaniz_Fatma
Community Manager

Hi @Oliver_Floyd! My name is Kaniz, and I'm the technical moderator here. Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer first; otherwise I will get back to you soon. Thanks.

Atanu
Esteemed Contributor
Accepted Solution

Hi @Oliver_Floyd, as per our documentation this can only be achieved through:

  1. the Spark config
  2. an init script

So I don't think it will work on the fly, but you could raise this as a feature request with our product team. Thanks.
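For reference, the external-metastore settings in question are the ones applied at cluster creation, something along these lines (the Hive version, JDBC URL, and secret scope/key names are illustrative; see the external Apache Hive metastore docs for the exact keys for your setup):

spark.hadoop.javax.jdo.option.ConnectionURL jdbc:sqlserver://lab_env.database.windows.net:1433;database=labdatabase
spark.hadoop.javax.jdo.option.ConnectionDriverName com.microsoft.sqlserver.jdbc.SQLServerDriver
spark.hadoop.javax.jdo.option.ConnectionUserName {{secrets/metastore/user}}
spark.hadoop.javax.jdo.option.ConnectionPassword {{secrets/metastore/password}}
spark.sql.hive.metastore.version 2.3.7
spark.sql.hive.metastore.jars builtin

Because these values are read when the metastore client is initialized at cluster start, changing them from a notebook afterwards has no effect; an init script can write them before Spark starts, but that still happens at cluster start, not on the fly.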

Oliver_Floyd
Contributor

Hello @Atanu Sarkar,

Thank you for your answer. I have created a feature request; I hope it will be accepted soon ^^
