SQL endpoint JDBC table support

RicksDB
Contributor II

Hi,

Is there any way or workaround to query JDBC tables from a SQL endpoint the same way one can with other types of clusters?

Doing so right now causes an error saying that only text-based files are supported (JSON, Parquet, Delta, etc.), even though the tables are recognized in the SQL workspace.

13 REPLIES

Anonymous
Not applicable

Hello @E H​!

My name is Piper and I'm one of the community moderators. I wanted to pop in and thank you for your question. I'm sure that a fellow community member will be by to answer your question shortly. If not, the team will be back on Monday.

Cheers!

Sebastian
Contributor

To my understanding, the SQL endpoint is connected to your Hive metastore. One possible approach is to register the corresponding JDBC table in that metastore so that its metadata is available to the SQL endpoint.

RicksDB
Contributor II

Hi Sebastian, thank you for your answer.

You are right, the SQL endpoint uses the same metastore as the other clusters. Therefore, after creating a JDBC table in a High Concurrency cluster, I do see the table in the SQL Analytics workspace. The table is recognized and the schema is even visible. However, when querying the table, an error occurs explaining that JDBC is not supported and only text-based files are. The same query works correctly when using the High Concurrency cluster.
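For context, a JDBC table of the kind described here is typically registered in the metastore with a Spark SQL `CREATE TABLE ... USING JDBC` statement. The following is a minimal sketch in Python that only builds such a statement as a string. The helper name `jdbc_table_ddl`, the table name, the URL, and the credentials are hypothetical placeholders and do not come from the thread.

```python
def jdbc_table_ddl(table, url, dbtable, user, password):
    """Build a Spark SQL statement that registers a JDBC-backed table.

    All argument values are illustrative assumptions, not values from the thread.
    """
    def q(value):
        # Escape backslashes and double quotes for a double-quoted SQL string literal.
        return value.replace("\\", "\\\\").replace('"', '\\"')

    return (
        f"CREATE TABLE {table}\n"
        "USING JDBC\n"
        "OPTIONS (\n"
        f'  url "{q(url)}",\n'
        f'  dbtable "{q(dbtable)}",\n'
        f'  user "{q(user)}",\n'
        f'  password "{q(password)}"\n'
        ")"
    )


# Hypothetical example values; on a regular cluster the result would be passed to spark.sql(...).
ddl = jdbc_table_ddl(
    "sales_jdbc",
    "jdbc:sqlserver://example-host:1433;database=exampledb",
    "dbo.sales",
    "example_user",
    "example_password",
)
print(ddl)
```

Running the generated statement on a High Concurrency cluster would make the table and its schema visible in the shared metastore, which matches the behavior described above: the SQL endpoint can see the table but cannot query it. In practice the credentials would come from a secret store rather than being embedded in the statement.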

BilalAslamDbrx
Honored Contributor II

@E H​ can you share a screenshot of the error in Databricks SQL? And also a screenshot of the query working in the Data science & engineering workspace?

RicksDB
Contributor II

Hi Muhammed,

Here are the screenshots of querying the same table. The first one is from the SQL Analytics workspace. The second one is from the Data Science & Engineering workspace.

SQL Endpoint: [screenshot: sqlEndpoint]

High Concurrency (or any other cluster besides SQL clusters): [screenshot: HighConcurrency]

RicksDB
Contributor II

To add more information to the topic,

I've seen this item in the release notes. Is it related? Was JDBC supported at some point and then deactivated?

[screenshot: ExternalDatasource]

BilalAslamDbrx
Honored Contributor II

@E H​ I thought I replied, but apparently I didn't -- apologies for the late response. So there are two different things in play:

  • JDBC tables are supported by Apache Spark and, as you can see in your screenshot, they work just fine in the Data Science & Engineering workspace. However, they do NOT work in Databricks SQL. We are interested in supporting data sources like these, but we're likely to support Redshift, MySQL, etc. before we get to generic JDBC support. So, to be clear, the JDBC data source never worked in DBSQL.
  • External Data Sources are different. They are part of the open source Redash product, but we are not planning on developing this feature right now.

What actual engine are you connecting to with the JDBC data source? Is it MySQL, Postgres etc?
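For readers looking for an interim workaround, one option that is not proposed in this thread is to snapshot the JDBC table into a Delta table from a regular cluster, so that the SQL endpoint can query the copy. The sketch below only builds the `CREATE OR REPLACE TABLE ... AS SELECT` statement. The helper name `delta_snapshot_sql` and both table names are hypothetical.

```python
def delta_snapshot_sql(jdbc_table, delta_table):
    """Build a CTAS statement that copies a JDBC-backed table into a Delta table.

    Assumption: the statement runs on a cluster where the JDBC table is queryable
    (for example a High Concurrency cluster), not on the SQL endpoint itself.
    """
    return (
        f"CREATE OR REPLACE TABLE {delta_table}\n"
        "USING DELTA\n"
        f"AS SELECT * FROM {jdbc_table}"
    )


# Hypothetical table names; the result would be passed to spark.sql(...) in a scheduled job.
print(delta_snapshot_sql("sales_jdbc", "sales_delta"))
```

The trade-off is that the Delta copy is only as fresh as the last run of the job, so this suits reporting workloads better than queries that need live data from the source database.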

Being able to join data coming from both Delta Lake and Azure SQL would be an excellent feature for SQL Analytics, giving us the option to supply the reporting community (Power BI) with a single data hub.

RicksDB
Contributor II

Thanks for the clarifications @Bilal Aslam​ .

The query above was executed against an Azure SQL Managed Instance data source.

Thanks,

Eric

RicksDB
Contributor II

Any news regarding this feature?

BilalAslamDbrx
Honored Contributor II

@E H​ nope, not yet. It's definitely on our list of things to figure out and support.

dimsh
Contributor

+1

It would be great to have this option. I'm building a Data Analytics Platform based on Databricks SQL for my customer. Databricks SQL is a really cool product, but it seems it doesn't support reading data from PostgreSQL (Azure DBaaS). For me, it just shows the schema and that's all. The same table works well in the Data Engineering workspace. Please consider this feature.

RyanD-AgCountry
Contributor

I've been waiting patiently for this option since the public preview in early 2021. The vast majority of our data is in SQL Server databases, and being unable to query these data sources is the primary reason the data team hasn't adopted the SQL workspace as a solution. We support JDBC connectivity, and it is nice that the metastore works, but not being able to query the data is, in my opinion, an oversight.
