Data Engineering

SQL endpoint is unable to connect to external Hive metastore (Azure Databricks)

prasadvaze
Valued Contributor II

Using Azure Databricks, I have set up a SQL endpoint with connection details that match the global init script. I am able to browse tables from a regular cluster in the Data Engineering module, but I get the error below when running a query through the SQL endpoint:

*****************

On the schema browser (left pane): Failed to load Databases list.

On the query results pane:

Error running query

org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient

************************

The SQL endpoint settings are as below (the user ID and password are in Azure Key Vault):

************************

spark.sql.hive.metastore.* true

spark.hadoop.fs.azure.account.auth.type.<storage_acct>.dfs.core.windows.net OAuth

spark.hadoop.fs.azure.account.oauth.provider.type.<storage_acct>.dfs.core.windows.net org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider

spark.hadoop.fs.azure.account.oauth2.client.id.<storage_acct>.dfs.core.windows.net {{secrets/<secret_scope>/client-id}}

spark.hadoop.fs.azure.account.oauth2.client.secret.<storage_acct>.dfs.core.windows.net {{secrets/<secret_scope>/client-secret}}

spark.hadoop.fs.azure.account.oauth2.client.endpoint.<storage_acct>.dfs.core.windows.net https://login.microsoftonline.com/<tenant_id>/oauth2/token

spark.hadoop.javax.jdo.option.ConnectionURL jdbc:sqlserver://<sql server name>.database.windows.net:1433;database=<db_name>

spark.hadoop.javax.jdo.option.ConnectionUserName metastoredbuser

spark.hadoop.javax.jdo.option.ConnectionPassword {{secrets/<secret_scope>/metastoredbpwd}}

spark.hadoop.javax.jdo.option.ConnectionDriverName com.microsoft.sqlserver.jdbc.SQLServerDriver

spark.sql.hive.metastore.version 2.3.7

spark.sql.hive.metastore.jars /databricks/hive_metastore_jars/*

************************
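A misconfigured or missing key in settings like these tends to surface only as the opaque SessionHiveMetaStoreClient error above. A minimal sanity-check sketch (pure Python; the helper name and key list are illustrative, mirroring the settings shown, not a Databricks API):

```python
# Hypothetical helper: flag external-metastore settings that are absent
# from a Spark conf mapping before they surface as a runtime error.
# Key names mirror the SQL endpoint settings listed above.
REQUIRED_METASTORE_KEYS = [
    "spark.hadoop.javax.jdo.option.ConnectionURL",
    "spark.hadoop.javax.jdo.option.ConnectionUserName",
    "spark.hadoop.javax.jdo.option.ConnectionPassword",
    "spark.hadoop.javax.jdo.option.ConnectionDriverName",
    "spark.sql.hive.metastore.version",
    "spark.sql.hive.metastore.jars",
]

def missing_metastore_keys(conf: dict) -> list:
    """Return the required metastore keys absent from a Spark conf mapping."""
    return [key for key in REQUIRED_METASTORE_KEYS if key not in conf]
```

Running this against the endpoint's conf (pasted into a dict) before saving the settings makes a typo in a key name visible immediately.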

1 ACCEPTED SOLUTION

Prabakar
Esteemed Contributor III

DBFS support for SQL endpoints is still not available, and we have a feature request for this, so using the init script won't help. You need to get the jars from Maven for now.

8 REPLIES

Prabakar
Esteemed Contributor III

Hi @prasad vaze​, instead of

spark.sql.hive.metastore.jars /databricks/hive_metastore_jars/*

please use

spark.sql.hive.metastore.jars maven

and let me know if it works.

prasadvaze
Valued Contributor II

This worked. But I am puzzled: in another workspace I downloaded the jars from Maven and stored them on DBFS, and the SQL endpoint there successfully uses "spark.sql.hive.metastore.jars /databricks/hive_metastore_jars/*". So how is that working? This is confusing — or is it a bug?

Kaniz Fatma

Hi @prasad vaze​, does @Prabakar Ammeappin​'s response answer your query?

prasadvaze
Valued Contributor II

Yes @Kaniz Fatma​, but I have a follow-up question in this thread.

Thanks, Prabakar

prasadvaze
Valued Contributor II

Is DBFS support for the SQL endpoint coming? When?

As I mentioned in my previous reply, in one of my workspaces the SQL endpoint works with a DBFS location for the jars, but in another workspace it does not (though Maven works). The latter workspace was created 3 days ago and the former 3 months ago, so something has changed behind the scenes for Databricks SQL, and I don't know where to find out about upcoming changes. What's the future guidance: use Maven coordinates for the jars, or download the jars into DBFS?

prasadvaze
Valued Contributor II

@Prabakar Ammeappin​ @Kaniz Fatma​ I also found that after a Delta table is created in the external metastore (with the table data residing in ADLS), I do not need to provide the ADLS connection details in the SQL endpoint settings. I provided only the metastore connection details, and the SQL endpoint was able to query the Delta table. This is because the metastore stores the ADLS folder name and path of the Delta table. (I have yet to find the exact metastore table that stores a Delta table's ADLS folder location; if you know it, please let me know.)
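For reference, in the stock Hive metastore schema (2.3.x), a table's storage path lives in the LOCATION column of the SDS table, joined to TBLS via SD_ID and to DBS via DB_ID. A sketch that builds such a lookup query for the metastore database (table and column names are from the standard Hive schema — verify against your own metastore DB; the helper name is illustrative):

```python
def table_location_sql(db_name: str, table_name: str) -> str:
    """Build a SQL query that looks up a table's storage location in a
    stock Hive metastore schema: DBS -> TBLS -> SDS.LOCATION."""
    return (
        "SELECT s.LOCATION "
        "FROM TBLS t "
        "JOIN DBS d ON t.DB_ID = d.DB_ID "
        "JOIN SDS s ON t.SD_ID = s.SD_ID "
        f"WHERE d.NAME = '{db_name}' AND t.TBL_NAME = '{table_name}'"
    )
```

The string interpolation here is for illustration only; against a real metastore database, use parameterized queries.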

Hi @prasad vaze​,

To find the paths of all the tables you want to check:

  • Managed tables in the default location are stored at spark.conf.get("spark.sql.warehouse.dir") + s"/$tableName".

  • If you have external tables, it is better to use catalog.listTables() followed by catalog.getTableMetadata(ident).location.getPath.

  • Any other paths can be used directly.

  • Determine which paths belong to Delta tables using DeltaTable.isDeltaTable(path).

Hope this helps.
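The steps above can be sketched as a small helper. The dependencies are injected so the logic itself is plain Python; the commented wiring at the bottom is an assumption about how one would plug in the Spark and Delta calls named above:

```python
def find_delta_paths(tables, warehouse_dir, is_delta):
    """Return the storage paths of Delta tables, per the steps above.

    tables: iterable of (name, table_type, location) tuples, where location
            is None for managed tables in the default warehouse.
    warehouse_dir: the value of spark.conf.get("spark.sql.warehouse.dir").
    is_delta: predicate on a path, e.g. wrapping DeltaTable.isDeltaTable.
    """
    paths = []
    for name, table_type, location in tables:
        # Managed tables without an explicit location live under the
        # warehouse dir; any other path is used directly.
        path = location if location else f"{warehouse_dir}/{name}"
        if is_delta(path):
            paths.append(path)
    return paths

# Hypothetical wiring in a notebook (not run here; assumes the delta package):
# from delta.tables import DeltaTable
# paths = find_delta_paths(
#     [(t.name, t.tableType, None) for t in spark.catalog.listTables()],
#     spark.conf.get("spark.sql.warehouse.dir"),
#     lambda p: DeltaTable.isDeltaTable(spark, p),
# )
```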
