Databricks SQL External Connections

Tewks
New Contributor

Lakehouse architectures seem enticing, especially from the standpoint of querying the data lake directly as it sits (as opposed to first migrating the data to an external data warehouse). While the documentation seems pretty clear about support for BI platforms like Tableau and Power BI, there is really no reference for using Databricks SQL as a source for queries from custom APIs.

I'm wondering if anyone has recommendations for querying Databricks SQL from .NET Web APIs through ODBC connections? Is this even a supported or recommended use case? Would the recommendation be to run these types of queries against a more traditional data warehouse? If it's supported, what are the downsides of querying directly through Databricks SQL (Cost? Performance? Tooling?). I'm guessing I'd be somewhat limited since I'd need to install the Databricks ODBC driver in order to query it? Also, it seems there are limits on the number of concurrent calls that can be made per cluster?

Many of these limitations make me think I should still push data to external data warehouses for these scenarios, but I wanted to see what everyone out there thinks.

Thanks!

Geoff

1 ACCEPTED SOLUTION

NateAnth
Valued Contributor

That is the wonderful thing about the Lakehouse: data is in open formats with open APIs. Please see these options for querying via Go, Node.js, and Python, as well as via the SQL Statement Execution API:

https://www.databricks.com/blog/2023/03/07/databricks-sql-statement-execution-api-announcing-public-...

https://www.databricks.com/blog/2022/06/29/connect-from-anywhere-to-databricks-sql.html
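
For a custom web API that would rather not install a driver, the SQL Statement Execution API from the first link can be called over plain HTTPS. Below is a minimal Python sketch, not an official client: the endpoint path `/api/2.0/sql/statements`, the request fields (`warehouse_id`, `statement`, `wait_timeout`), and the response shape (`status.state`, `manifest.schema.columns`, `result.data_array`) reflect my understanding of the public API and should be checked against the current docs. The function names are hypothetical, and the host, token, and warehouse ID are placeholders you would supply.

```python
import json
import urllib.request

# Assumed endpoint path for the SQL Statement Execution API; verify in the docs.
STATEMENTS_PATH = "/api/2.0/sql/statements"


def build_statement_request(host, token, warehouse_id, statement, wait_timeout="30s"):
    """Build (but do not send) a POST request that submits one SQL statement."""
    body = {
        "warehouse_id": warehouse_id,   # ID of the SQL warehouse to run on
        "statement": statement,         # the SQL text
        "wait_timeout": wait_timeout,   # how long the server waits before replying
    }
    return urllib.request.Request(
        url=f"https://{host}{STATEMENTS_PATH}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def rows_from_response(payload):
    """Turn an inline JSON result into a list of {column: value} dicts."""
    state = payload.get("status", {}).get("state")
    if state != "SUCCEEDED":
        # PENDING/RUNNING statements would need polling, which this sketch omits.
        raise RuntimeError(f"statement did not finish successfully: {state}")
    columns = [col["name"] for col in payload["manifest"]["schema"]["columns"]]
    data = payload.get("result", {}).get("data_array") or []
    return [dict(zip(columns, row)) for row in data]


def run_statement(host, token, warehouse_id, statement):
    """Send the request and parse the reply (needs network access and real credentials)."""
    request = build_statement_request(host, token, warehouse_id, statement)
    with urllib.request.urlopen(request) as response:
        return rows_from_response(json.load(response))
```

Because this is only HTTPS plus JSON, the same request could be issued from a .NET Web API with its usual HTTP client. Long-running statements and large results need polling and chunked fetches, which are left out here.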

As you said, you can also download and install the ODBC driver to connect from different applications:

https://docs.databricks.com/integrations/jdbc-odbc-bi.html
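
To make the ODBC route concrete, here is a rough Python sketch of a DSN-less connection using the third-party `pyodbc` package. The connection-string keys (`Host`, `Port`, `HTTPPath`, `SSL`, `ThriftTransport`, `AuthMech`, `UID=token`, `PWD`) follow the pattern I recall from the driver documentation linked above, and the driver name `Simba Spark ODBC Driver` is an assumption that varies by platform and driver version, so treat all of them as values to verify. The function names are hypothetical.

```python
def databricks_odbc_connection_string(host, http_path, token,
                                      driver="Simba Spark ODBC Driver"):
    """Assemble a DSN-less ODBC connection string (keys are assumptions to verify)."""
    parts = {
        "Driver": driver,          # installed driver name, or a path to the driver library
        "Host": host,              # workspace server hostname
        "Port": "443",
        "HTTPPath": http_path,     # HTTP path of the SQL warehouse
        "SSL": "1",
        "ThriftTransport": "2",    # HTTP transport
        "AuthMech": "3",           # username/password style auth
        "UID": "token",            # literal string "token" when using an access token
        "PWD": token,              # the personal access token itself
    }
    return ";".join(f"{key}={value}" for key, value in parts.items())


def run_query(connection_string, sql):
    """Run one query over ODBC and return all rows (needs pyodbc and the driver)."""
    import pyodbc  # third-party; imported lazily so the builder above works without it

    connection = pyodbc.connect(connection_string, autocommit=True)
    try:
        cursor = connection.cursor()
        cursor.execute(sql)
        return cursor.fetchall()
    finally:
        connection.close()
```

The same connection-string keys should carry over to other ODBC clients, including .NET, though the driver still has to be installed on every host that runs the API.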

Databricks SQL Warehouses can scale vertically for data throughput as well as horizontally for concurrency. Please review your concurrency requirements with your Databricks account team for specific guidance.

2 REPLIES

Aviral-Bhardwaj
Esteemed Contributor III

These are really awesome details.