Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Databricks SQL External Connections

Tewks
New Contributor

Lakehouse architectures seem enticing, especially from the standpoint of querying the data lake directly as it sits (as opposed to first migrating the data to an external data warehouse). While the documentation seems pretty clear about support for BI platforms like Tableau and Power BI, there is really no reference for using Databricks SQL as a source for queries from custom APIs.

I'm wondering if anyone has recommendations for querying Databricks SQL from .NET Web APIs through ODBC connections. Is this even a supported or recommended use case? Would the recommendation for these types of queries be to execute them against a more traditional data warehouse? If it's supported, what are the downsides of querying directly through Databricks SQL (cost, performance, tooling)? I'm guessing I'd be somewhat limited, since I'd need to install the Databricks ODBC driver in order to query it. Also, it seems there are limits on the number of concurrent calls that can be made per cluster?

Many of these limitations make me think I should still push data to external data warehouses for these scenarios, but I wanted to see what everyone out there thinks.

Thanks!

Geoff

1 ACCEPTED SOLUTION

NateAnth
Databricks Employee

That is the wonderful thing about the Lakehouse: data is in open formats with open APIs. Please see these options for querying via Go, Node.js, and Python, as well as via the SQL Statement Execution API:

https://www.databricks.com/blog/2023/03/07/databricks-sql-statement-execution-api-announcing-public-...

https://www.databricks.com/blog/2022/06/29/connect-from-anywhere-to-databricks-sql.html

As you said, you can also download and install the ODBC driver to connect from different applications:

https://docs.databricks.com/integrations/jdbc-odbc-bi.html

Databricks SQL Warehouses can scale vertically for data throughput as well as horizontally for concurrency. Please review your concurrency requirements with your Databricks account team for specific guidance.
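
For anyone who wants to avoid installing a driver, here is a minimal sketch of calling the SQL Statement Execution API (first link above) over plain HTTPS. It is written in Python for brevity, but the same request can be sent from a .NET HttpClient. The host, token, and warehouse ID are placeholders, the helper names are hypothetical, and the response handling assumes a small result that comes back inline before the wait timeout expires. Larger or longer-running statements need polling and chunk handling that is not shown here.

```python
import json
import urllib.request


def build_statement_request(host, token, warehouse_id, statement, wait_timeout="30s"):
    """Build (but do not send) a POST to the SQL Statement Execution API.

    host, token and warehouse_id are placeholders supplied by the caller.
    """
    body = {
        "warehouse_id": warehouse_id,
        "statement": statement,
        # How long the call blocks waiting for the statement to finish.
        "wait_timeout": wait_timeout,
    }
    return urllib.request.Request(
        url=f"https://{host}/api/2.0/sql/statements",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def rows_from_response(payload):
    """Return inline rows from a finished statement, else raise.

    Assumes the result is small enough to be returned inline as a JSON array.
    """
    state = payload.get("status", {}).get("state")
    if state != "SUCCEEDED":
        raise RuntimeError(f"statement not finished: state={state}")
    return payload.get("result", {}).get("data_array", [])


def run_statement(host, token, warehouse_id, statement):
    """Send the request and return the rows (requires network access)."""
    req = build_statement_request(host, token, warehouse_id, statement)
    with urllib.request.urlopen(req) as resp:
        return rows_from_response(json.load(resp))
```

Usage would look like run_statement("<workspace-host>", "<access-token>", "<warehouse-id>", "SELECT 1"). Since every call is an ordinary HTTPS request, the Web API tier only needs an HTTP client and a token rather than an ODBC driver install.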


2 REPLIES

Aviral-Bhardwaj
Esteemed Contributor III

These are really awesome details.

AviralBhardwaj
