
How to update the external metastore cluster configuration on the fly?

Oliver_Floyd
Contributor

Hello,

In my use case, data is pushed to an ADLS Gen2 container called ingest.

After some processing on a Databricks cluster in the ingest workspace, I declare the associated table in an external metastore for this workspace.

At the end of this processing (depending on certain criteria), I push the curated data (a simple copy) to other containers (lab/qal/prd; each container holds the data for one Databricks workspace), and I want to declare the tables in the metastores of these three workspaces.
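The copy itself is nothing special; it is essentially a recursive file copy between containers, along these lines (the storage account and paths are placeholders):

# hypothetical paths -- copy the curated output from the ingest container to the lab container
src = "abfss://ingest@myaccount.dfs.core.windows.net/curated/my_table"
dst = "abfss://lab@myaccount.dfs.core.windows.net/curated/my_table"
dbutils.fs.cp(src, dst, True)  # True = recursive copy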

One solution is to launch three tasks after this first task, where each task's cluster is configured with the metastore of its target workspace (see the job sketch below). It works, but this solution is cumbersome:

  • a cluster has to be started for each workspace
  • even if the table is already declared in the metastore, the cluster still has to start just to check
  • it slows down our data pipeline
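To give an idea, the job definition looks roughly like this (Jobs API 2.1 format; the notebook path, Spark version, and JDBC URLs are illustrative, and the prd task is analogous):

{
  "name": "declare-curated-tables",
  "tasks": [
    {
      "task_key": "declare_lab",
      "notebook_task": { "notebook_path": "/Jobs/declare_tables" },
      "new_cluster": {
        "spark_version": "10.4.x-scala2.12",
        "num_workers": 1,
        "spark_conf": {
          "spark.hadoop.javax.jdo.option.ConnectionURL": "jdbc:sqlserver://lab_env.database.windows.net:1433;database=labdatabase"
        }
      }
    },
    {
      "task_key": "declare_qal",
      "notebook_task": { "notebook_path": "/Jobs/declare_tables" },
      "new_cluster": {
        "spark_version": "10.4.x-scala2.12",
        "num_workers": 1,
        "spark_conf": {
          "spark.hadoop.javax.jdo.option.ConnectionURL": "jdbc:sqlserver://qal_env.database.windows.net:1433;database=qaldatabase"
        }
      }
    }
  ]
}

Each task needs its own job cluster only because the metastore connection is fixed at cluster start.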

Another solution would be to update the cluster configuration on the fly in the first task. I tried to modify the Spark session configuration with the following lines of code:

spark.sparkContext.getConf().set("spark.hadoop.javax.jdo.option.ConnectionURL", "jdbc:sqlserver://lab_env.database.windows.net:1433;database=labdatabase")

Or

spark.conf.set("spark.hadoop.javax.jdo.option.ConnectionURL", "jdbc:sqlserver://lab_env.database.windows.net:1433;database=labdatabase")

but neither of them seems to work.
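Here is roughly how I checked in the notebook that the change is not picked up (the JDBC values are placeholders):

# getConf() returns a copy of the SparkConf, so mutating it has no effect on the running cluster
conf = spark.sparkContext.getConf()
conf.set("spark.hadoop.javax.jdo.option.ConnectionURL", "jdbc:sqlserver://lab_env.database.windows.net:1433;database=labdatabase")

# the runtime conf accepts and stores the new value...
spark.conf.set("spark.hadoop.javax.jdo.option.ConnectionURL", "jdbc:sqlserver://lab_env.database.windows.net:1433;database=labdatabase")
print(spark.conf.get("spark.hadoop.javax.jdo.option.ConnectionURL"))

# ...but the catalog still answers from the metastore the cluster was started with
spark.sql("SHOW DATABASES").show()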

My question is simple: do you know if there is a way to change this configuration from a notebook, or is it not possible at all?

Thanks in advance for your help.


3 REPLIES

Kaniz_Fatma
Community Manager

Hi @Oliver_Floyd! My name is Kaniz, and I'm the technical moderator here. Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer first; otherwise I will get back to you soon. Thanks.

Atanu
Esteemed Contributor
Accepted Solution

Hi @Oliver_Floyd, as per our documentation this can only be achieved through:

  1. the Spark config
  2. an init script

So I don't think it will work on the fly, but you could raise this as a feature request with our product team. Thanks.
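For reference, the external-metastore settings in question are the ones applied at cluster creation, something along these lines (the Hive version, JDBC URL, and secret scope/key names are illustrative; see the external Apache Hive metastore docs for the exact keys for your setup):

spark.hadoop.javax.jdo.option.ConnectionURL jdbc:sqlserver://lab_env.database.windows.net:1433;database=labdatabase
spark.hadoop.javax.jdo.option.ConnectionDriverName com.microsoft.sqlserver.jdbc.SQLServerDriver
spark.hadoop.javax.jdo.option.ConnectionUserName {{secrets/metastore/user}}
spark.hadoop.javax.jdo.option.ConnectionPassword {{secrets/metastore/password}}
spark.sql.hive.metastore.version 2.3.7
spark.sql.hive.metastore.jars builtin

Because these values are read when the metastore client is initialized at cluster start, changing them from a notebook afterwards has no effect; an init script can write them before Spark starts, but that still happens at cluster start, not on the fly.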

Oliver_Floyd
Contributor

Hello @Atanu Sarkar,

Thank you for your answer. I have created a feature request; I hope it will be accepted soon ^^
