topic Re: Can I get more details on the performance differences between pyodbc and SQL Connector for Pytho in Data Engineering

Can I get more details on the performance differences between pyodbc and SQL Connector for Python?

Travis84 — Tue, 26 Aug 2025 07:24:18 GMT

This article (Connect Python and pyodbc to Databricks | Databricks on AWS) states the following:

"However pyodbc may have better performance when fetching queries results above 10 MB."

This is a bit vague. The word "may" implies "maybe not". Also, "better performance" is not quantitative. How much better? Are there any benchmarking studies? I have not been able to find out more information.

Re: Can I get more details on the performance differences between pyodbc and SQL Connector for Pytho

SP_6721 — Tue, 26 Aug 2025 14:00:59 GMT

Hi @Travis84 ,

The documentation sounds vague but real-world performance depends on many factors. For clarity, the best approach is to run a simple test by executing the same large query with both pyodbc and the Databricks SQL Connector in your environment.
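A minimal sketch of such a test, in case it helps. It assumes both drivers are installed and that you fill in your own connection details. The helper names (`time_fetch`, `make_pyodbc_fetch`, `make_connector_fetch`) are made up for this example and are not part of either library. It only measures wall-clock time for execute plus `fetchall()`, so treat the numbers as a rough indication for your environment, not a benchmark study.

```python
import statistics
import time
from typing import Callable, List


def time_fetch(fetch: Callable[[], int], repeats: int = 3) -> dict:
    """Call fetch() `repeats` times and report row count plus min/median seconds."""
    durations: List[float] = []
    rows = 0
    for _ in range(repeats):
        start = time.perf_counter()
        rows = fetch()  # fetch() returns the number of rows it retrieved
        durations.append(time.perf_counter() - start)
    return {
        "rows": rows,
        "min_s": min(durations),
        "median_s": statistics.median(durations),
    }


def make_pyodbc_fetch(conn_str: str, query: str) -> Callable[[], int]:
    """Build a fetch function that runs `query` through pyodbc."""
    def fetch() -> int:
        import pyodbc  # imported lazily so the harness loads without the driver

        conn = pyodbc.connect(conn_str, autocommit=True)
        try:
            cursor = conn.cursor()
            cursor.execute(query)
            return len(cursor.fetchall())
        finally:
            conn.close()
    return fetch


def make_connector_fetch(server_hostname: str, http_path: str,
                         access_token: str, query: str) -> Callable[[], int]:
    """Build a fetch function that runs `query` through the Databricks SQL Connector."""
    def fetch() -> int:
        from databricks import sql  # imported lazily, same reason as above

        with sql.connect(server_hostname=server_hostname,
                         http_path=http_path,
                         access_token=access_token) as conn:
            with conn.cursor() as cursor:
                cursor.execute(query)
                return len(cursor.fetchall())
    return fetch
```

To compare, build one fetch function per driver with the same query (ideally one returning well over 10 MB), pass each to `time_fetch`, and compare `median_s`. Running against the same warehouse and repeating a few times helps reduce noise from caching and warm-up.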

Re: Can I get more details on the performance differences between pyodbc and SQL Connector for Pytho

Coffee77 — Wed, 27 Aug 2025 08:57:31 GMT

I can't give you a comparison, but at least in my case the Spark SQL Connector over SQL Server behaves well when retrieving a moderate number of rows from SQL tables. As said in previous comments, it depends on multiple factors: not only the driver but the database design as well. I started using that connector because I needed an OLTP system integrated with the Databricks Lakehouse, so I created a set of functions to interact with SQL Server, and so far performance has been good. With the new addition of Databricks Lakebase over PostgreSQL, maybe I need to upgrade... oops. I mention this because the new feature might be a good fit for you as well.

Re: Can I get more details on the performance differences between pyodbc and SQL Connector for Pytho

WiliamRosa — Wed, 27 Aug 2025 11:26:46 GMT

Hi @Travis84,

I came across an article that might help you. It is a Databricks blog post on high-bandwidth connectivity that describes the Cloud Fetch optimization (parallel data transfer via pre-signed URLs) and reports up to 12× faster extract throughput for very large datasets (~3.4 GB) when using ODBC-based tooling:
https://www.databricks.com/blog/2021/08/11/how-we-achieved-high-bandwidth-connectivity-with-bi-tools.html

Re: Can I get more details on the performance differences between pyodbc and SQL Connector for Pytho

Travis84 — Wed, 27 Aug 2025 11:59:31 GMT

Interesting article, but it seems that Cloud Fetch is supported by both the ODBC driver and the SQL Connector for Python:

Driver capability settings for the Databricks ODBC Driver | Databricks on AWS

Databricks SQL Connector for Python | Databricks on AWS (see section getting started)