Re: Can I get more details on the performance diff...

Travis84 · ‎08-26-2025

This article (Connect Python and pyodbc to Databricks | Databricks on AWS) states the following:

"However pyodbc may have better performance when fetching queries results above 10 MB."

This is a bit vague. The word "may" implies "maybe not", and "better performance" is not quantitative. How much better? Are there any benchmarking studies? I have not been able to find any more information.

SP_6721 · ‎08-26-2025

Hi @Travis84 ,

The documentation does sound vague, but real-world performance depends on many factors. The clearest answer will come from a simple test: run the same large query with both pyodbc and the Databricks SQL Connector in your own environment.
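
A minimal sketch of such a test, in case it helps. It assumes both `databricks-sql-connector` and `pyodbc` are installed and that an ODBC DSN for the Databricks driver already exists. The helper names (`time_fetch`, `connect_sql_connector`, `connect_pyodbc`, `compare`) are made up for illustration, and every value in angle brackets is a placeholder you would need to fill in.

```python
import time
from contextlib import closing


def time_fetch(connect, query, runs=3):
    """Time execute + fetchall for `query`, opening a fresh connection per run.

    `connect` is any zero-argument callable that returns a DB-API 2.0
    connection. Connection setup is not included in the measured time.
    Returns a list of (seconds, row_count) tuples, one per run.
    """
    results = []
    for _ in range(runs):
        with closing(connect()) as conn, closing(conn.cursor()) as cur:
            start = time.perf_counter()
            cur.execute(query)
            rows = cur.fetchall()
            elapsed = time.perf_counter() - start
        results.append((elapsed, len(rows)))
    return results


def connect_sql_connector():
    # Imported lazily so the timing helper works without this package.
    from databricks import sql

    # Placeholder values (assumptions): replace with your own workspace details.
    return sql.connect(
        server_hostname="<workspace-hostname>",
        http_path="<warehouse-http-path>",
        access_token="<personal-access-token>",
    )


def connect_pyodbc():
    # Imported lazily so the timing helper works without this package.
    import pyodbc

    # Placeholder DSN (assumption): a DSN configured for the Databricks ODBC driver.
    return pyodbc.connect("DSN=<your-databricks-dsn>", autocommit=True)


def compare(query, runs=3):
    """Run the same query through both drivers and print the best time for each."""
    drivers = [
        ("sql-connector", connect_sql_connector),
        ("pyodbc", connect_pyodbc),
    ]
    for name, connect in drivers:
        results = time_fetch(connect, query, runs)
        best = min(seconds for seconds, _ in results)
        print(f"{name}: best of {runs} runs = {best:.2f}s, rows = {results[0][1]}")
```

Calling `compare("SELECT * FROM <your_large_table>")` with a result set well above 10 MB would give a first impression. Caching on the warehouse side may favor whichever driver runs second, so it is probably worth alternating the order and repeating the test a few times before drawing conclusions.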

Coffee77 · ‎08-27-2025

I can't give you a comparison, but at least in my case the Spark SQL Connector over SQL Server behaves pretty well when retrieving a moderate number of rows from SQL tables. As said in previous comments, it depends on multiple factors: not only the driver but the database design as well. In any case, I started using that connector because I needed an OLTP system integrated with the Databricks Lakehouse. So I created a set of functions to interact with SQL Server and, at least so far, performance has been good. With the new addition of Databricks Lakebase over PostgreSQL, maybe I need to upgrade... oops. I mention this because this new feature might be a good fit for you as well.

Lifelong Solution Architect Learner | Coffee & Data

WiliamRosa · ‎08-27-2025

Hi @Travis84,

I came across an article that might help you. It makes the following comparison:
A Databricks blog post on high-bandwidth connectivity describes the Cloud Fetch optimization (parallel data transfer via pre-signed URLs) and reports up to 12× faster extract throughput for very large datasets (~3.4 GB) when using ODBC-based tooling.
https://www.databricks.com/blog/2021/08/11/how-we-achieved-high-bandwidth-connectivity-with-bi-tools...

Wiliam Rosa
Data Engineer | Machine Learning Engineer
LinkedIn: linkedin.com/in/wiliamrosa

Travis84 · ‎08-27-2025

Interesting article, but it seems that Cloud Fetch is supported by both the ODBC driver and the SQL Connector:

Driver capability settings for the Databricks ODBC Driver | Databricks on AWS

Databricks SQL Connector for Python | Databricks on AWS (see section getting started)

Can I get more details on the performance differences between pyodbc and SQL Connector for Python?